THIS IS A PRACTICAL GUIDE TO COLLEGE TESTING 
for teachers and administrators who recog- 
nize the importance of tests in focusing at- 
tention on the individual student and in ap- 
praising institutional programs but who lack 
formal training in evaluative techniques. This 
book analyzes current practices in the use of 
tests for various objectives and provides de- 
scriptive accounts of testing programs in rep- 
resentative colleges and universities. 


How tests are used in conjunction with 
other data in admissions, placement, instruc- 
tion, and counseling; what current tests can 
and cannot measure; how they are used to 
evaluate the attainment of institutional goals; 
what kinds of tests are available and where 
they may be obtained—these are some of the 
topics here considered that will help the non- 
specialist to use and interpret tests more effec- 
tively. 


Here are practical suggestions for setting up 
or improving a testing program, including the 
selection of tests to meet the educational purposes 
of the institution, the administrative 
organization of the program, and the mechanics 
of testing. 


The testing programs of seven institutions, 
selected for their diversity in type of organization, 
size of enrollment, and educational aims, 
are presented by specialists in the field. They 
offer concrete examples of programs in action 
and reflect a versatility in the use of tests and 
in the operation of evaluation programs which 
will be provocative to all educators. 
Foreword 


IN THE APPRAISAL OF INTELLECTUAL QUALITIES AND ATTAINMENTS, 
tests have a contribution to make that may seem self-evident today, 
when formal testing procedures are being widely used in the ele- 
mentary and secondary schools, at the point of admission to col- 
lege, in placement within the college program, in appraising indi- 
vidual and group achievement in college, at the point of entry into 
graduate and professional schools, in conjunction with entry into 
professional practice, and in many other ways. 

Indeed, in the judgment of the members of the Committee on 
Measurement and Evaluation, a major contemporary problem in 
education is that the American public—on and off the college 
campus—seems too ready to assume that test procedures and test 
results can provide complete and final answers to the many serious 
problems of evaluation that we face. 

To the extent that testing helps to focus attention on the in- 
dividual in our educational processes, we can all applaud. But 
whether individuals are really helped or harmed will depend on 
the ways in which tests are used and interpreted. In a day in which 
testing is applied from the cradle to the grave, it is of crucial im- 
portance that those who use the tests understand what they can 
and do show and what they cannot and do not show. 

This book has been prepared for the college teacher and ad- 
ministrator who is now concerned—and likely to be more so—with 
testing procedures and materials but who is without formal train- 
ing in the techniques of testing. The Committee on Measurement 
and Evaluation hopes that it will assist teachers and administrators 
alike to put tests to use in ways that are appropriate and that con- 
tribute to their efforts to keep the individual student at the center 
of the stage in all of his own uniqueness. 

Many individuals have aided in the preparation of this book, 
as the chairman of the committee, Paul R. Anderson, has pointed 
out in his Preface. The Educational Testing Service encouraged 
and aided the project in important ways; the W. T. Grant Founda- 
tion supplied funds which helped make publication possible. To 
all of these, the committee and the Council are grateful. 
ARTHUR S. ADAMS, President 
American Council on Education 


Preface 


THE USE OF MEASUREMENT DEVICES IN INSTITUTIONS OF HIGHER 
learning has increased by leaps and bounds since the war, partly 
stimulated by the widespread development and use of tests by the 
military services in the early 1940’s. Expanding enrollments in the 
1960’s will in all likelihood result in further extension of this prac- 
tice. 

The refinement of evaluation instruments has developed more 
rapidly than the sophistication of the consumer. Graduate pro- 
grams, with few exceptions, fail to provide training in the arts and 
techniques of effective instruction. The result is that all too many 
professors and administrators possess little or no understanding of 
tests, their applicability, their range of utility, or their limitations. 
Lack of knowledge rather than of interest accounts both for failure 
to take advantage of many splendid instruments currently available 
and also for some misuse of them. 

This situation stimulated the Committee on Measurement and 
Evaluation of the American Council on Education to consider the 
advisability of preparing a manual for teachers and others not 
expert in testing covering the principal characteristics and uses of 
tests in various areas such as admissions, placement, and instruc- 
tion. The present volume is the result. Its purpose is to serve not 
the specialist in the field but a large group of people in higher edu- 
cation who would like to make greater use of evaluation instru- 
ments and yet who know relatively little about either their poten- 
tial or their limitations. This guide should provide such a person 
with a background of knowledge in the field and lead him to refer- 
ences through which he can carry his inquiry as far as he likes. 

The project was administered under an editorial subcommittee 
composed of E. F. Lindquist, C. Robert Pace, Ralph W. Tyler, and 
the writer working cooperatively with staff members of the 
Educational Testing Service ably marshaled by Anna Dragositz. Ideas 
and initial drafts were provided by William E. Coffman, Anna 
Dragositz, John S. Helmick, A. Pemberton Johnson, Gerald V. 
Lannholm, Richard Pearson, and Barbara Wagner. In order to 
provide unity of style and an approach to the material based upon 
campus experience and possible use, Lily Detchen, director of 
evaluation services at Chatham College, did the final editing. Col- 
lege Testing: A Guide to Practices and Programs is thus a product 
of a number of minds and we hope the better for being so. 


From the beginning it was felt that a practical guide should include 
representative examples of programs in action. Part II of the 
volume is composed of descriptive articles by persons intimately 
involved in the programs in seven institutions. In contrast to Part 
I, which the Committee on Measurement and Evaluation has studied 
and approved, Part II is the product of individual authors. 

The Committee on Measurement and Evaluation acknowledges 
with deep appreciation the contribution which the various plan- 
ners, authors, and editors have made. Without their deep interest 
in the project and their uncommon willingness to have their own 
contributions reviewed, revised, and rewritten in line with com- 
mittee desire, this volume could never have been possible. 

It is the committee's hope that readers will find here materials 
helpful to the resolution of teaching and other educational prob- 
lems in their own institutions and that this volume may therefore 
serve a useful purpose in the improvement of higher education. 


PAUL R. ANDERSON, Chairman 


Committee on Measurement and Evaluation 
PART I 

The Role of Measurement in Relation 
to Educational Problems of the College 


1. The Status and Basic Purposes 
of College Testing 


THE USE OF TESTS AT THE COLLEGE LEVEL HAS INCREASED PHENOME- 
nally in recent years and foreseeable pressures indicate an even 
greater need of tests in the future. Increases in the college-age popu- 
lation, improvement in the socioeconomic status of American 
families, trends in the general economy that encourage young 
people to remain in school, the ever-growing conviction that more 
of our high school graduates are entitled to a higher education, 
and the need of our nation for college-educated people—all these 
are factors in the greater activity in testing. One major college 
admissions testing agency reports that its activities increased by 
more than 500 percent in a recent eight-year period, and more 
recently increased by 50 percent from one year to the next. State 
testing programs have seen rapid growth in twenty-six states, 
usually at the twelfth-grade level, although the trend is toward 
beginning testing at earlier levels. Other states have similar pro- 
grams under consideration. 

In addition to the admissions and placement tests so widely used, 
many other kinds of tests have been developed for additional ed- 
ucational purposes. Such interest has emerged from a widespread 
effort in higher education to rethink the objectives of institutional 
curricula and services. This has frequently resulted in the intro- 
duction of programs of general education, in improving the selec- 
tion of students for professional and other specialized programs, 
and in general in a greater consideration of the students as in- 
dividuals. In the course of instituting changes, a need evolved for 
evaluating the result of such changes, hence the development of 
tests for special purposes. 

Interest in the theory and methodology of testing at the college 
level was given great impetus by the Thurstones at the University 
of Chicago in the early twenties when they introduced “psychological 
aptitude” testing, and a little later developed the American 
Council on Education Psychological Examination for College 
Freshmen, which was first used as a selective admissions test at the 
University of Chicago and subsequently adopted by hundreds of 
other higher institutions. The ACE Psychological Examination 
group intelligence testing in 
War I which had helped solve 
specialized branches of the 
was the author of many of 
terest continued unabated in 
d wars and he inspired and 
thers which improved both 

group became the focus of testing. In 
cial or “differential” aptitudes 
nt testing, diagnostic measures 
d weaknesses of the individual, 

er vague “general” content, and 

the standpoint of closer descriptions 
of student groups were stressed, 
persons—both teachers and administrators—would like to use tests 
but don’t quite know how to go about it. The discussions here are 
designed for this group. The aim is to provide them with enough 
background to serve their purpose, enough description of content 
and method to enable them to choose the test most useful to them, 
enough of the technique for practical use, and enough information 
in general to enable them to distinguish between the tests useful 
to nonspecialists and those which require the closer attention of 
the specialist. 


THE KINDS OF TESTS IN GENERAL USE 

Any discussion of the status of college testing may well begin 
with a description of the kinds of tests and examinations that are 
now in fairly general use. There are several categories of these, 
generally designated as scholastic aptitude, achievement, interests, 
personal adjustment, and special abilities (or aptitudes) tests. 
These tests may have been developed by classroom instructors or 
college departmental staffs; by nonprofit research organizations 
or other groups representative of education; by research centers 
attached to universities that receive commissions to produce tests 
needed for particular purposes; by individuals who secure publi- 
cation through commercial publishers; by publishing houses that 
maintain staffs for this purpose; by groups of college faculty mem- 
bers who cooperate under the auspices of some special project to 
produce tests they need in common; by boards of professional 
examiners attached to universities, to civil service examining units, 
to state departments of education and to city educational systems, 
and so on. 


Scholastic aptitude tests 

Just what academic learning ability is or whether it exists in any 
absolute sense is difficult to say, but the day has long since passed 
when measurement specialists say they are attempting to measure 
any unique quality called “intelligence” or any abstract psychological 
factors which are attributes of unique mental functions, 
although in the early days of measurement it was hoped that this 
might be possible. Experience has proved otherwise. 

From the beginning of mental testing, constructors of tests have 
tended to avoid the bizarre and have worked on the assumption 
that the best way to obtain an index of relative ability to learn is 
to incorporate in the test problems only materials from common 
experience—which actually meant usually drawn from common 
school activities. The result was that these early tests predicted 
school success better than success in other activities which require 
intellectual ability. Thus, these tests might better have been 
termed scholastic aptitude tests than intelligence tests. 

The change in name from intelligence tests to scholastic aptitude 
tests indicates, at the higher academic levels at least, a change in 
emphasis in the test content. When the problem is that of predicting 
academic success, consideration is given to the abilities which 
are requisite to academic success. For most school learning situations, 
these abilities are firmly based upon verbal facility and 
quantitative reasoning, though other abilities are required to a 
lesser degree. 

Thus, today’s aptitude tests in intellectual areas place emphasis 
on reading, vocabulary skills, and understanding; on arithmetic 
reasoning and problem-solv 

of these are learned abilities, so that these tests in a certain sense 

ce they measure generalized abilities 
developed over a long period of time rather than the skills and 

specific to a circumscribed body of 
knowledge. 

Because commonly used scholastic aptitude tests are measures of 
achievement in a broad sense o 

) may be operating. 
such as previous school 
do work at the college level, but simply has not had the quality 
of instruction to prepare him for the kinds of tasks the test sets. 
What now? May the score be looked upon as an indication of 
scholastic aptitude? In a certain sense, yes. If the student is apply- 
ing to a college that, on the one hand, has very high academic 
standards and, on the other, makes no provision for remedial work 
in basic skills, the chances are that the student will find the going 
very difficult unless he is able to compensate for his shortcomings 
by great additional effort. No amount of sympathetic understand- 
ing will enable him to undertake work for which he is not ready. 
In situations where special remedial work is offered to students 
who are deficient in reading or numerical skills, for example, the 
score may operate less effectively as an indicator of probable scholastic 
success. 


Achievement tests 

Achievement tests measure proficiency in defined areas of learn- 
ing. Normally, the tests are taken after a prescribed period of 
study, and results are expected to indicate how well the subject 
has been mastered. 

Currently, there are two kinds of achievement tests: tests which 
measure learning in specific courses such as trigonometry, Latin, 
American history, and so on, and general achievement tests which 
cut across specific course boundaries to measure learning attained 
in a broad field of study such as the humanities, social studies, or 
science. Both types might be further classified into those emphasizing 
factual information and those which measure understanding 
and the ability to apply to new situations the skills and 
principles learned. The latter, of course, are likely to be of greater 
interest at the college level. 


Interests tests 


Although a review of academic records and scholastic aptitude 
tests is useful in appraising student interests, there are certain 
rather obvious shortcomings in relying on them too heavily. The 
meaning of grades is actually obscure because so many factors are 
involved—native ability, the enthusiasm that teachers have engendered 
in students, and the subjective judgments of the teachers, 
to name but a few. It is useful to supplement grades and scholastic 
aptitude tests with evidences of interests from other sources. 

Among the most useful sources are interest inventories, which 
are designed to reveal information on an individual's predominant 
interest patterns. Some are intended to appraise academic interests, 
but the most commonly used measures are based on vocational 
interest patterns. Here, two general types are available: one type, 
of which the Strong Vocational Interest Blank is an example, is 
scorable in terms of specific jobs or occupations such as lawyer, 
engineer, minister, journalist, and so on. The other, of which the 
Kuder Preference Record is an example, yields scores related to 
broad areas of interest, such as scientific, literary, or persuasive, 
which, in turn, may be related by a counselor to different cate- 
gories of jobs and professions. 

Tests for specific occupations are generally more time-consum- 
ing and are expensive to score. If only a general idea of predomi- 
nant interests is needed, a broad-area inventory may be all that 
is indicated. In some situations it may be useful to administer both 
types. Also, it should be noted that some appear to work better 
with younger students than others and some seem to be more useful 
with one sex than the other sex. The decision to use any interest 
inventory at all should rest primarily on whether the college 
has a staff member who is trained in the use and interpretation 
of such instruments. 



Personal and social adjustment tests 
Most college students do not fall into this category. They are, by 
and large, normal, healthy, maturing young people, and the kind 
of information needed is that which will reflect their personal 
qualities and be useful to them in making long-range personal 
and vocational choices. 

Inventories planned to yield information of this sort are avail- 
able. Unfortunately, however, a number of them have grown out 
of clinically oriented situations and have little meaning except to 
clinicians. The questionable selection and definition of aspects of 
personality selected for scoring is not, however, the only limitation 
such inventories have. A more basic limitation, even for the 
trained user, is a lack of data establishing the extent to which their 
measurements are valid. 


Measurements of special abilities and aptitudes 


A host of tests has been developed to measure artistic, musical, 
scientific, clerical, and mechanical aptitudes and other special 
abilities.¹ Also, there are measures of visual discrimination, verbal 
facility, numerical ability, memory ability, figure relations, finger 
and manual dexterity, and so forth. Some are presented in the 
familiar paper-and-pencil form of objective tests; others are performance 
tests which require the individual to work directly with 
special materials, such as might be needed to assemble a lock or 
to manipulate small objects quickly. Some have been developed on 
the basis of specific job specifications and have been standardized 
on employed groups. Others have grown out of attempts to isolate 
and measure specific abilities which might be required for success 
in different types of work. Some have been related to specific kinds 
of vocational training, principally at the secondary school level, 
and provide norms for such groups. But few as yet have been developed 
and standardized for college populations. 

By and large, traditional college populations will have less use 
for measures of this type than will terminal-vocational programs. 
For either group it is best to determine first whether any of the 
basic aptitude and skills tests are likely to do as good a job of 
predicting success in a special field of study as will measures of more 
specific abilities. There are several reasons for this. First, many 
courses, despite their intent to provide training for specific occupations, 
depend largely on lectures and reading assignments and 
thereby place a premium on verbal ability. Second, some measures 
of so-called special aptitudes duplicate to quite a large degree the 
kinds of materials included in measures of general scholastic aptitude. 
Third, those that have been developed principally for use 
with employed workers are not appropriate for college students, 
particularly when the number of students in a program is too small 
for developing adequate local norms. Fourth, some of them, particularly 
performance tests, are rather complex to administer and 
interpret. Colleges which, after a thoughtful review of such tests, 
deem it desirable to employ them in their programs, should be 
prepared for their lack of norms and data on validity and reliability 
for college students and should be prepared to undertake special 
studies of their effectiveness. 


¹ Dewey B. Stuit, et al., Predicting Success in Professional Schools. The authors 
discuss prediction in engineering, law, medicine, dentistry, music, agriculture, 
teacher training, and nursing; they review research findings and state implications 
for counseling under each of these major findings. 




PURPOSES SERVED BY TESTS 


If held to their basic function as tools, tests can be of immense 
assistance to an instituti 

tion, and the process of defining 

goals is most complex, but the 
importance of th 

be overemphasized.² Only then can tests be used to appraise the 
measure of achievement of those goals. 


² For an informative account of how some cooperating college faculties pursued 
the problem of defining institutional objectives, see the report by Paul L. Dressel 
and Lewis Mayhew, General Education: Expl i 

similar earlier work published by 
in General Education: ii 
operative Study in G 


Tests aid the administrator 

An effective testing program can serve the administration of a 
college in meeting many of its responsibilities. To suggest but a 
few: test results can help find the answers to such questions as: 
Do the current offerings fulfill the academic aims of the institu- 
tion? Does the institution tend to emphasize some of its aspects to 
the detriment of others? Is the student body suitable for the kind 
of program that is offered, and vice versa? Are scholarship funds 
being most wisely dispensed? Tests rarely, if ever, supply exact 
answers to such questions, but they help to focus attention on particular 
issues and provide descriptive data. In a well-conceived program, 
the interpretation of test data contributes to understanding 
what is needed for the solution of the problem. The use of tests 
in general institutional evaluation is the subject of chapter 6. 


Tests aid the instructional staff 

Since faculty members are vitally involved in the achievements 
of their institution, they need answers to many of the same kinds 
of questions that press the administrators. Their interests, however, 
have a narrower focus, for their concern with over-all achievement 
is likely to be more closely related to their personal objectives 
in instruction. 

When faculty members understand how to appraise results of 
tests and testing programs, they can use the information in a 
variety of ways. Many an instructor has simplified his problems 
by instituting homogeneous groupings in his classes—groupings 
made possible by analysis of results of tests of student aptitude, 
achievement, or skills. Again, instructors, being generally highly 
sensitive and moral individuals who are concerned that their grad- 
ing practices be fair, are interested in perfecting their evaluation 
procedures. The faculty member assigned to a committee working 
on some institutional problem—for example, the determination 
of criteria for selective admission of students to specialized fields, 
say in engineering and education—can find test data highly revealing. 
The instructor who wishes to determine the effects of a 
given educational device—say, the effect on attitudes of an educational 
film on race relations as compared with a lecture on the 
subject—will find that test results will supplement and perhaps 
modify his subjective impressions. 

Some experienced instructors maintain that the kind of test 
used influences dramatically the emphasis that a student gives to 
his study. A whole department can be revitalized by the use of 
special “major” examinations with students trained in the department: 
if the examinations are constructed by the faculty 
members or even by a board at the institution, teaching objectives are 
sharpened in the process; if the tests are externally prepared, the 
instructors may have either the reassurance of good results or may 
be challenged by the inadequate performance of their students to 
re-examine their materials or approach. 

going outcomes of testing may be to the 
greatest value appears when he is stimu- 
nstructing tests or examining test results, 
to re-examine the validity of his own teaching objectives, the degree 
to which he achieves his aims, and the appropriateness of his 
instructional materials, and then to revise his aims and methods 
accordingly.³ This subject is given further consideration in chapter 4. 




Tests aid the student 
his progress: the results of a diagnostic reading test or an inventory 
of study habits and accompanying counseling interviews are first 
steps toward remedying a shortcoming. 

In course studies, if the student is led to expect tests which are 
challenging, they can stimulate his interest in the subject and 
cause him to improve both his study habits and methods of prepa- 
ration. Or tests may obviate his having to repeat work that he has 
already mastered and enable him to go on to more challenging 
studies. On a different level the student who is unsure of himself 
in a subject may gain some sense of security and attainment if he 
finds that he has reached or exceeded the average successful per- 
formance of the large sample of students on the test. 

At the decision points of a college student's career—especially at 
the end of the sophomore and senior years—tests can guide him to 
his next step, perhaps to a choice of his major study, to a decision 
to try for a graduate fellowship, or even to the logical decision to 
discontinue academic education if his progress has been unpromis- 
ing. There are all kinds of tests available in a well-equipped coun- 
seling center that can help the student understand himself and 
help others understand him so that he may derive the optimum 
benefit from his higher educational experience. 


The remainder of Part I is devoted to a more detailed discussion 
of particular aspects of formulating and using a comprehensive test- 
ing program—admissions, placement, instruction, counseling, gen- 
eral institutional evaluation, and, finally, the organization and 
administration of the institutional testing program. 

In order to demonstrate the many approaches to the use of tests 
in improving higher education and their versatility in different 
situations and for different purposes, a small number of institu- 
tions were invited to tell the stories of their testing programs. 
These institutions were chosen as representative of various emphases 
in testing and various degrees of complexity of organization. 
Their reports appear in Part II. 


2. The Use of Tests in the Admission 
of Students to College 


NOT ALL INSTITUTIONS HAVE PROBLEMS OF STUDENT SELECTION, SINCE 
many are required by law or purpose to admit all applicants who 
are high school graduates—and adults who are not; such institu- 
tions can only strive to provide the array of curricula needed to fit 
a heterogeneous student body and perhaps later eliminate those 
who are not suited to the curricular offerings. While such institu- 
tions may well take advantage of preadmissions testing to improve 
student placement and guidance, these are not purposes related to 
selective admissions and so will not be treated here. 


FACTORS RELATING TO THE LOCAL SITUATION 

There are, however, institutions where enrollment must be 
limited, sometimes because of the special kind of education 
deemed appropriate for studen 

frequently, by reason of w 

utset that “success” is here de- 

for the most common problem 
faced in admissions is the identification of students who can be 
graduated under the prevailing academic standard. Beyond that, 
depending upon the size of its pool of applicants, an institution 
may further be able to secure such other representation in the 
make-up of the class as will guarantee leadership in various activ- 
ities considered important to the college community, a good rep- 
resentation of creative talents or special interests in various fields 
of endeavor, and the variety of ethnic, religious, economic, and 
geographic background that ensures the desired degree of cultural 
representation in the group. 

For most institutions, the definition of “success” adopted above 
is tenable; for others, not completely so. For, at the one extreme, 
there are those who strive to find not only students who can be 
graduated with the usual C average, but as many as possible who 
can be graduated both with grade averages well above minimum 
and with other attributes signifying a high level of intellectual 
and social development. At the other extreme, there are institutions 
with high student mortality rates who, because of local circum- 
stance, may consider themselves fortunate to secure students who 
have a good chance of remaining just through the early phase of 
the program; such students may be expected to profit from their 
briefer educational experience as much in terms of improved at- 
titudes or other personal development as in the kind of academic 
achievement which is commonly described by grade-point ratios. 
Between these extremes fall other variations of institutional ex- 
pectation for entering groups, related either to the educational 
philosophy of the institution or to local circumstances. 

Tests have repeatedly proven their usefulness as predictors of 
success in the most common academic areas—liberal arts, medi- 
cine, dentistry, engineering, law, and nursing. They are being 
developed rapidly, too, as predictors of academic success in some 
of the so-called occupational fields, again with the measure of 
success described either by grade-point ratio in these curricula 
or by achievement scores in later institution-wide testing. That 
they do not function at an even higher level of prediction results 
partly from the failure of colleges to identify clearly for test 
builders the kind of abilities in their applicants they want differ- 
entiated and described and partly from the failure of local methods 
of evaluation to appraise educational objectives successfully. For 
these reasons, therefore, tests must be carefully selected in the first 
place to fit existing circumstances, and their efficiency must be 
judged in the final analysis only after the local effort to describe 
student achievement has been perfected. This means then that a 
prediction test which may be highly successful in one situation 
does not necessarily apply in another and that each institution 
needs to weigh local factors carefully in anticipation of setting up 
an effective admissions testing program. 


The selection of the tests 


n as a result of the population in- 
t an institution may not give suffi- 
Cteristics of its student population 
will be tempted to use tests and 


first place. It may be found tha 


purpose, but until it is so demonstrated, such testing will remain 
suspect. 

year performance is a good criterion of success in the remaining 
college years; therefore, it appears that if admissions criteria can be 
utilized that describe the skills needed in the first year they will 
prove useful on a long-term basis also. For this reason, prediction 
studies can, with some validity and convenience at the outset, be 
confined to first-year performance. 

If the institution is in no position to construct tests to measure 
the basic skills needed by its freshmen, it will need to examine 
published tests to see which incorporate the skills and background 
information considered requisite to the beginning studies of fresh- 
men. Such tests are described in the Mental Measurements Year- 
book.¹ The kind of tests required will vary greatly from one insti- 
tution to another. In one, freshmen may take work mainly in gen- 
eral education, for which readiness may best be described by 
achievement of basic background information in general studies 
and ability to read and analyze specific materials at a certain level 
of difficulty; in another institution, freshmen may need to demon- 
strate readiness for courses in chemistry and mathematics of a 
specific level. Or perhaps the same institution may have to identify 
both kinds of groups and therefore need to utilize a greater variety 
of admissions testing and criteria. The experiences of other col- 
leges of similar purpose who have already used the tests should be 
solicited. Research data should be sought in the literature or se- 
cured from the test publisher. If possible the tests should be given 
a trial locally with an entering freshman group prior to any final 
decision on their use. Often they can be incorporated in the 
orientation-week program of tests usually required of accepted 
candidates. A few years of trial, or less if the numbers provide a 
sufficiently large sample, and comparisons with the success crite- 
rion (usually grade-point ratios) will reveal their usefulness. 
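
The comparison of trial test scores with the success criterion can be sketched in a few lines. This is a minimal illustration and not a procedure given in the text: it assumes the criterion is the first-year grade-point ratio, it uses the Pearson product-moment correlation as the index of relationship, and the scores shown are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical trial data: one test score and one first-year
# grade-point ratio per entering freshman.
trial_scores = [42, 55, 61, 48, 70, 66, 39, 58]
first_year_gpr = [1.8, 2.4, 2.9, 2.0, 3.4, 3.0, 1.6, 2.7]

r = pearson_r(trial_scores, first_year_gpr)
print(f"correlation with first-year grade-point ratio: {r:.2f}")
```

A coefficient near zero would suggest the test adds little to prediction locally; a small sample such as this one would in practice be far too few cases to support a decision.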


Plans for utilizing data 


Statistical methods for analyzing test data for prediction pur- 
poses are provided in all textbooks of educational statistics. One of 
the most clear-cut and detailed descriptions may be found in 
Statistics in Education and Psychology by Garrett.² A 
discussion of the concept of prediction versus chance may be found 
in a bulletin prepared by a test publisher.³ The very small institu- 
tion can often derive great benefit from admissions data by using 
merely inspection techniques, simple analyses, or even what 
amounts to a case-study approach. 

more formal statistical basis. Larger institutions who need to con- 
vert data for IBM analysis can obtain special information on that 
method from a booklet prepared by the American Association of 
Collegiate Registrars and Admissions Officers.⁴ 


¹ Oscar K. Buros (ed.), Fourth Mental Measurements Yearbook. This book in- 
cludes more complete descriptions of tests than are found in publishers’ catalogues. 
Included are reviews by one or more critics and references to published research 
on the tests. 


The agency plan 


Few colleges are in a position to administer their own admis- 
sions tests; the factor of geography alone is prohibitive. For this 
reason many college admissions testing programs rely heavily on 
the special programs of such agencies as the College Entrance 
Examination Board or those of state-wide testing services or on 
senior year testing of local high schools. The value of the services 
of these agencies to an institution can be judged only by the rele- 
vance of the tests used to local need, the cost, the 
lation provided by the agent, and the 
applicants. The great majorit 
access to some such service. 
and locally administered programs is that students will have had 
experience with the exact tests used, since some tests suitable for 
admissions may have been used by high 
school guidance counselors. When admissions tests are handled by 
the college itself, it should select those that are specifically re- 
stricted for college admissions use; participation in agency pro- 
grams should be restricted to those with adequate security controls. 


THE USE OF TESTS IN SUPPLEMENTING AND 
CLARIFYING OTHER ADMISSIONS DATA 

Regardless of the content of the tests or the sources from 
which they originate, they must be viewed not as infallible criteria, 
but as partial criteria to be weighed with other evidence of the 
applicant’s fitness. These other elements may include the appli- 
cant’s high school grades, the pattern of the courses he has taken, 
recommendations from his school, the impression he makes in in- 
terview, perhaps samples of his writing, his ability to finance his 
education without undue strain, evidence of his incentive, and 
results of any other tests that he may already have taken. 


Conflicting data 

Most of the difficulties which an admissions staff experiences 
in utilizing the multiple criteria are related fundamentally to the 
problem of equating these criteria to each other, since they possess 
as many denominators as there are schools, raters, and test pub- 
lishers contributing data. Let us consider some of the causes of the 
discrepancies that will be found. 

Primarily, there is an uncontrollable variability in high school 
grading practices. But these variabilities do not necessarily arise 
from poor evaluation procedures. What the high school does for 
the kind of students it has and how it grades them may be quite 
valid, since not all high schools operate primarily as college-pre- 
paratory institutions. Some colleges draw most of their students 
from a few high schools whose purposes and standards they know 
well and may therefore be able to use some discrimination in eval- 
uating the grades given. Other colleges that draw their student 
bodies from a wider radius and so have more high schools to cope 
with are not in that informed position. Some colleges draw both 
from private college-preparatory institutions of varying quality 
and from public college-preparatory and nonpreparatory high 
schools of varying quality. Obviously, many different grading prac- 
tices will be reflected among the applicants’ transcripts. 


⁵ For analyzing grade differences among high schools, some colleges have found 
useful an equating plan developed by the Commission on the Relation of Inde- 
pendent Schools to Higher Education of the National Council of Independent 
Schools. The material is contained in Annual Reports of the National Registration 
Office for Independent Schools, and shows the kinds of grades obtained at par- 
ticular colleges by graduates with a given grade average from particular high 
schools. These Annual Reports are issued to member schools of the council. Further 
information may be obtained from Marjory Etnyre, Secretary, National Registra- 
tion Office of the National Council of Independent Schools, Room 103, 5801 Ellis 
Ave., Chicago 37, Ill. 

The problem of controlling the factor of prejudice in personal 
ratings is also universally recognized. Sometimes a rating can be 
evaluated in terms of how well known to the admissions personnel 
the individual rater may be. The rating of an experienced admis- 
sions counselor can be wrong, too, if, for example, he has had to 
formulate an opinion when the applicant was accompanied to the 
interview by a domineering parent. 

The contribution to the total 
test data of the student’s hi 
port if the situation is clea 
skill in interpretation, sin 
of test data found on tr 
thorough appreciation for 
be able to use quite a col 
Unless his training 
rector flounders at tl 
picture of all previously obtained 
decisions are quite difficult and may revolve about one of several 
questions. 

For example, there may be an applicant from a high school 
unknown to the admissions office whose admissions test reports, 
previous test data supplied by the school, or the principal's rating 
conflict with the high school grade report; if the admissions test 
scores seem poor in the face of other data, retesting through a spe- 
cial arrangement with a high school officer may clear up the dis- 
crepancy. In retesting, it may be desirable to try the student on 
nonspeeded tests, if speed of work is not an important criterion of 
success at the particular college, or to arrange for a selection of 
tests more suitable to the individual situation than those repre- 
sented in the regular admissions battery. 

Another difficulty arises when there is a surplus of seemingly 
equally well-qualified applicants, a situation encountered usually 
in choosing among applicants for scholarships. Needless to say, 
the more definitive the test data that can be secured, the easier the 
selection of final candidates. For this reason many institutions re- 
quire that scholarship applicants take tests that are additional to 
the regularly required battery. 

There are usually a few cases where a personal rating of one in- 
dividual rater raises doubt about a candidate whose qualifications 
may otherwise seem excellent. A good high school record supported 
by acceptable admissions test data has undoubtedly saved more 
than one prospective candidate from the ire of a rater, who may for 
unjust reasons have attacked the candidate’s fitness. Rechecks of 
such ratings through special correspondence are in order. 


Special cases in applicant evaluation 


Finally, supplementary tests can be of help in handling those 
special applicants whose background may be completely alien to 
that covered by the admissions tests used, yet whose preparation to 
undertake advanced studies may be entirely adequate. After all, 
tests designed for general admissions uses must be tailored to fit 
the common preparation that most students do receive and they 
are standardized for such groups; therefore, they should not be 
expected to serve as well for the qualified applicant who has re- 
ceived his education in what might be termed an unorthodox 
fashion. 

For example, there may be precocious non-high-school graduates 
who can demonstrate that they are well enough equipped to un- 
dertake the program; similarly, there may be a few applicants, 
part of whose preparatory work was in business subjects, who can 
demonstrate achievement of the basic skills and interests that com- 
pensate for the few units they lack; there are adults who may have 
an advantage in preparation and others with a discrepancy result- 
ing from delays in their formal education; and there are inter- 
national students who need to determine their readiness for an 
American education and the level at which they should enter. 

Another group of special applicants that can be better handled 
with recourse to tests are the transfer cases. Some institutions have 
but few of these, but there are many others, particularly those fed 
by neighboring junior colleges, who accept a great many transfer 
students, and sometimes have even more transfer students in their 
upper classes than they have students who entered as freshmen. 
Such institutions can, through the use of some such battery as the 
National College Sophomore Testing Program, the Graduate Rec- 
ord Area Examinations, or other tests they consider to be more 
appropriate to their own basic curriculum requirements, make a 
reliable selection among the applicants. 

Occasionally an institution may wish to consider a doubtful 
transfer applicant, say, one who has not done well elsewhere in 
terms of grade average but who, for reasons that need not be de- 
veloped here, deserves a second chance. Take, for example, the 
transfer student who brings an F or an Incomplete in two courses, 
which he received for not submitting, because of an accident, the 
final course term papers. While, at most institutions, he probably 
could not be allowed credit for these courses, he might through 
examination demonstrate his competence in the basic skills and 
content encompassed and so be allowed to utilize them to fulfill 
college prerequisite requirements. 

These several descriptions of solutions to problem cases have 
been developed to demonstrate the general usefulness of tests in 
certain dilemmas that arise in unusual instances. In all these cases, 
it might be pointed out, not only has the institution gained, but 
the applicant also has usually improved his ability to accept what- 
ever decision is made and undertakes with greater confidence what 
is for him a new experience in adjustment. 


THE ADMINISTRATION OF THE ADMISSIONS 
TESTING PROGRAM 

A college that decides to incorporate tests in its admissions pro- 
gram must design administrative procedures to support their best 
functioning. If admissions tests are required, that fact should be 
mentioned in the college catalogue; potential applicants should be 
informed as early and as clearly as is possible about the nature of 
the tests, when and where they may be taken, what they will cost, 
and what bearing they will have on the final decision to be made. 

Provisions must be made for the integration of the admissions 
test data with the other data about the applicant. Usually, all are 
assembled simply in one folder along with all correspondence with 
the student, his high school, and his parents. If resources permit, 
it is advisable that one qualified person summarize all test data of 
an applicant on one report sheet, so that they will not be scattered 
throughout the folder and so that they receive closer consideration 
than time permits in the usual admissions committee meeting. 
Most of the admissions test data will subsequently be useful in the 
general guidance of accepted candidates; thus, provisions should 
be made for transferring the roster of test results to appropriate 
college offices and faculty advisers for permanent retention. 

If the college receives and attempts to utilize test data from 
several or more sources—say, the various College Entrance Exami- 
nation Board programs, the Independent High School Testing 
Program of the Educational Records Bureau, the Differential Apti- 
tude Tests data supplied by a high school, and, perhaps, the data 
obtained in one of the several state-wide programs—it is advisable 
to assemble in advance whatever information is needed to inter- 
pret the normative data supplied for each test and if possible to 
attempt in advance some rough equating of such data to data that 
may already exist for students currently enrolled in the institution. 
Each institution can in time build up its own norms so that it need 
not be dependent on the shifting bases of national norms or the 
special bases of norms for particular kinds of students. 

In advance of final decisions, the admissions personnel can rank 
students roughly into several or more categories of acceptability 
in line with principles previously established by the governing 
group. 
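
The building of local norms and the rough ranking of applicants can be illustrated with a short sketch. It is only one possible arrangement under stated assumptions: the percentile-rank definition (percent of enrolled students scoring below, plus half of those tied), the three category labels, and the cutoffs of 25 and 75 are all hypothetical choices rather than rules given in the text; each governing group would set its own.

```python
def percentile_rank(score, local_scores):
    """Percent of the local group below `score`, counting half of any ties."""
    if not local_scores:
        raise ValueError("local_scores must not be empty")
    below = sum(1 for s in local_scores if s < score)
    tied = sum(1 for s in local_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(local_scores)


def acceptability_category(rank, low_cut=25.0, high_cut=75.0):
    """Map a local percentile rank to a rough category (hypothetical cutoffs)."""
    if rank >= high_cut:
        return "clearly acceptable"
    if rank >= low_cut:
        return "acceptable, review other evidence"
    return "doubtful, review closely"


# Hypothetical scores of currently enrolled students on one admissions test.
enrolled_scores = [38, 41, 45, 47, 50, 52, 52, 55, 58, 60, 63, 67]

for applicant_score in (40, 52, 66):
    rank = percentile_rank(applicant_score, enrolled_scores)
    print(applicant_score, round(rank, 1), acceptability_category(rank))
```

Such a category is a starting point for the committee and not a decision in itself, since the test score remains only one of the criteria to be weighed.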

Tests must not be viewed as the ready-made, flawless answer to 
admissions puzzles; they supply substantial clues in each situation, 
but their application has limitations not always readily apparent 
to the uninitiated. Tests are more likely to fail because of misuse 
by those who hoped for a short cut in analysis than because of 
shortcomings not made explicit in the data provided by the pub- 
lisher. Therefore, if the admissions staff does not include a quali- 
fied test specialist, there should be easy access to the institution’s 
test officer, or, if there is none, to a qualified consultant who can 
lend occasional help. 

Once effective procedures for handling admissions test data are 
developed, the procedures should not be allowed to become me- 
chanical or routine. The three important elements of the process 
—the program of the institution, the caliber of the applicant group 
in general, and the success criterion of the institution—can and 
probably will change with time. There is thus good reason to 
reconsider the elements periodically and every reason to subject 
all admissions practices, including the tests and test procedures 
utilized, to reappraisal. 


FURTHER READING 

There are other discussions on the subject of admissions testing 
that the interested reader will want to examine. For several years 
the College Entrance Examination Board has published the papers 
delivered at its annual colloquia on college admissions problems.⁶ 
The papers present a multitude of considerations and procedures 
of general interest to admissions personnel and are well worth 
scanning for those of definite local application. A discussion by 
Chauncey and Frederiksen in Educational Measurement⁷ elabo- 
rates many of the points that have had but brief mention here; in 
particular, their discussion of the reliability of high school rank 
and their discussion of admissions to special college programs 
within the university are illuminating. A simple beginning to the 
statistics of prediction has been given by Cronbach.⁸ The works 
mentioned contain bibliographies. Persons responsible for the for- 
mulation and execution of admissions policies can keep informed 
on relevant issues reported in the College Board Review.⁹ 


⁶ College Admissions 1, 2, 3, and 4. College Entrance Examination Board, Box 592, 
Princeton, N.J. 

⁷ Henry Chauncey and Norman Frederiksen, “The Function of Measurement in 
Educational Placement,” in E. F. Lindquist (ed.), Educational Measurement, 
pp. 85 ff. 

⁸ Lee J. Cronbach, “Treatment of Data in Prediction Studies” and “Principles 
of Prediction,” Essentials of Psychological Testing, pp. 247 ff. 

⁹ College Board Review. Published three times a year by the College Entrance 
Examination Board; subscription offices, P.O. Box 592, Princeton, N.J. 

3. The Use of Tests in Course 
Placement or Accreditation 


AS USEFUL AS THE ADMISSIONS AND GUIDANCE PROGRAMS OF A COL- 
lege may be in helping generally to promote optimum learning 
and to reduce student failure, the range of abilities among stu- 
dents in a particular class or program of study may still be wide. 
It is very difficult to bridge the gaps in learning that occur with 
students whose abilities, preparation for a course, and interests 
vary greatly. Sometimes greater individualization of instruction is 
achieved by establishing numerous small sections; perhaps this is 
the ideal solution, but it is one that is not practicable for most col- 
leges. Therefore, many institutions have instead adopted a policy 
of placing students in course sections representing different de- 
grees of attainment so that instruction may proceed at the level 
appropriate to the capacity or readiness of the group. 


STEPS IN THE PLACEMENT PROCEDURE 

In almost all colleges there are certain courses which practically 
all, if not all, freshmen and sophomores are required to take be- 
cause their content and purposes are considered essential. In some, 
the purpose is to provide important and basic general background 
which students, if left to their own initiative of electing courses, 
might not obtain; in others, the major purpose is to develop skills 
and acquire knowledge needed for subsequent courses. Regardless 
of the reason, the result is that at most institutions more students 
are enrolled in certain courses of the first two years than can readily 
be accommodated in single instructional units. Thus, there is a 
need to establish some basis for sectioning students into smaller 
units. 

If it is decided that placement in a course is to be made on the 
basis of homogeneity, then an analysis of the abilities and back- 
ground required for success in the course is necessary. What gen- 
eral skills and how much specific knowledge are necessary to en- 
able the student to make best progress from his current level? 
Theoretically, where selective admissions criteria are employed 
and if the criteria have been good, any student should be able to 
complete any required course. But the issue in point is to have 
him do so with maximum opportunity for progress and without 
interference to or from fellow students. It is not always necessary 
that each class unit be handled in exactly the same way, even 
though the general course objective may be identical for all; both 
content and method can be varied depending upon the readiness 
of students. A course in first-year chemistry for premedical stu- 
dents may be quite different from a course in beginning chemistry 
for the student of agronomy; a course in the development of social 
institutions can be taught largely by a discussion method to stu- 
dents of certain reading sophistication and some demonstrated 
knowledge of the subject, with greater enrichment for them, but 
this may not be the best approach for students of lesser back- 
ground, alertness, and interest in the field. 

A first step in the placement procedure, then, is to determine 
what talents a student should possess, or just what he should be 
able to do, to derive optimum benefit from the course as it will 
be arranged. Arrangement, of course, is contingent upon local re- 
sources for making optimum provisions. Once the expected prep- 
aration or abilities have been defined, attention can be directed 
to selecting or constructing tests suitable for establishing their at- 
tainment. Frequently, and more often than is commonly assumed, 
some of the tests used in an entrance program may be further used 
successfully for purposes of placement and acceleration. Some- 
times, other data, such as high school grades in related subjects, can 
supplement test data. 

The identification of criteria for placement is first established 
on an a priori basis with time eventually dictating their usefulness. 
They may prove ineffective because of poor initial choice, poor 
instruction, or poor methods of evaluating course success; all three 
factors then must be considered in reappraising the placement 
criteria. 

Fortunately, there are a number of tests available that are quite 
suitable for placement purposes at the college freshman level; 
these identify general aptitudes, skills, and background achieve- 
ment. The number of units of high school study and the quality 
of achievement as indicated by the high school grade (where the 
standard of the high school grade is known) also contribute to a 
more accurate understanding of the individual student’s readiness. 
Locally constructed examinations, built up from past examination 
materials of the course, will add another dimension to the evalua- 
tive data. In fact, it is probably quite safe to assert that as far as 
placement in course is concerned, there are much more data avail- 
able for use than many an institution is able to utilize. 


WAIVING REQUIRED COURSES 


Frequently it may be desirable to waive required courses if stu- 
dents have already mastered the skills and learning which the 
courses are intended to develop. When this is done, some basis 
other than course grades is needed to place the exempted students 
in an intermediate or advanced course. In some instances, the final 
examination for Course A 


student has adequately mas- 
to a subsequent course. Cer- 
had the required special training in the prerequisite course. Here 
aptitude alone may be a sufficient basis for waiving a prerequisite. 
However, entrance into the subsequent course should be based 
on aptitude alone only if there is adequate evidence that students 
with a high level of general aptitude can actually perform success- 
fully in the course without previously having acquired the pre- 
requisite knowledge or skills. 


PLACEMENT IN SPECIFIC COURSES 

Placement in freshman English 

Since freshman English is the most generally required subject 
in college, placement activity in this area is probably greater than 
in any other college course. 

A college is seldom able to establish more than three freshman 
English groups: superior, average, and below average. The criteria 
ordinarily employed are: (1) senior English grades from high 
schools whose grading standards are known, since they give at 
little cost a relatively dependable picture of the student’s most 
recent performance; (2) a measure of scholastic aptitude, probably 
available from entrance test data; this may be a total score, or, if 
the test has a verbal and a quantitative section, only the verbal 
score, since it indicates the extent to which the student under- 
stands and is able to deal with verbal relationships; and (3) scores 
from an English test administered for admission or as part of the 
freshman guidance test program. When such scores are not avail- 
able from some other institutional testing program, the depart- 
ment may administer a local or published test on the fundamentals 
of English, or one aimed more directly at writing ability, depend- 
ing upon the objectives of the course and what is expected of the 
beginning student. 


Placement in foreign languages 

Foreign languages also rank high, insofar as placement or the 
waiving of courses is concerned, because here success in subsequent 
courses also depends directly upon skills built in earlier courses. 
If students come from a variety of secondary schools with varying 
quality of foreign language instruction, some standard basis for 
assessing the level at which they should continue study is most 
desirable. 

Examinations of proficiency in vocabulary, grammar, and read- 
ing and aural comprehension will readily supply the information 
needed for assigning students to a beginning course or to a course 
aimed at developing basic skills at a somewhat higher level, or to 
a conversation course, or to one of the several literature courses. 
Cutting points for determining these various assignments are best 
fixed by first using the tests in the courses with regularly enrolled 
students as their final examination and determining the correla- 
tion between their proficiency and the test results. 
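
The fixing of cutting points from final-examination results can be sketched as follows. This is an illustration under assumptions that are not in the text: the course names, the scores, and the rule itself (the cut for skipping a course is the median score of enrolled students who finished that course proficiently) are hypothetical, and the sketch covers only the derivation of cutting points, not the correlation check that should accompany it.

```python
import statistics

# Hypothetical final-examination scores on the placement test, earned by
# regularly enrolled students rated proficient at the end of each course.
proficient_scores = {
    "beginning": [31, 35, 38, 40, 44],
    "intermediate": [52, 55, 59, 61, 66],
}

# Assumed course sequence; the last course has no cutting point of its own.
COURSE_ORDER = ["beginning", "intermediate", "literature"]


def cutting_points(scores_by_course):
    """Assumed rule: the cut for skipping a course is the median score of
    students who completed that course proficiently."""
    return {course: statistics.median(scores)
            for course, scores in scores_by_course.items()}


def place(score, cuts):
    """Assign the first course in sequence whose cutting point is not met."""
    for course in COURSE_ORDER:
        if course not in cuts or score < cuts[course]:
            return course
    return COURSE_ORDER[-1]


cuts = cutting_points(proficient_scores)
for entering_score in (30, 45, 70):
    print(entering_score, "->", place(entering_score, cuts))
```

A department might prefer a more lenient or a stricter rule than the median; whichever is chosen, its usefulness would still need to be confirmed against later performance in the assigned course.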


Placement in other courses 


The problem of placement in mathematics resembles that in 
foreign languages since most colleges offer a uniform sequence of 
courses with progress in subsequent courses dependent upon suc- 
cess in preceding ones. Consequently, it is relatively easy to select 
or to develop a test which covers the material appropriate for one 
or more of these courses. 

In the natural sciences the problem is more complex. In many 
institutions, there is a variety of science offerings, with one course 
for the student with no background, another for the nonmajor 
who has had a high school course, still another for the pre-engineer, 
and a fourth for the potential major. Furthermore, for certain of 
these, specific mathematical skills may be required. In more ad- 
vanced chemistry courses, there may be prerequisites from other 
fields of science. In a situation as complex as this 
tant to analyze carefully the intended v 
and the extent to which the subsequent 
knowledge and skills developed in the pr 
the requirement of a prerequisite high s 
for students with high scientific aptitu 
true, the college course really starts all 


a much faster pace for those who have had previous work. In many 
science courses, quantitative aptitude is likely to be important, but 
seldom to the exclusion of verbal aptitude. Specific tests of scien- 
tific aptitude are available or can be constructed. A combination 
of these, selected empirically and utilized along w 
thorough content analysis, solves many pl 
more advanced science courses, the work is much more likely to 
begin at a specific level, since there is certainty that all students 
will have reached this point. Here tests of achievement are likely 
to be much more effective for placement or screening than are tests 
of aptitude. 

Placement in the social sciences and the humanities is similar 
in some respects to, but different from, that discussed so far. These 
disciplines are much less likely to have a standard sequence of 
courses, each a prerequisite for the one following. Individual 
courses are much more likely to be self-contained. Course pre- 
requisites more often serve to ensure general familiarity with the 
vocabulary, tools, and concepts of the subject. If this is the case, 
an aptitude test and a general test covering vocabulary, tools, and 
concepts may be appropriate for determining admission to ad- 
vanced courses. 

MEETING DEGREE AND HONORS REQUIREMENTS 
THROUGH EXAMINATIONS 

At times it may be appropriate to grant a student credit in a 
course without his taking it if he demonstrates a particular level of 
competency. Or it may be desirable to know that each student has 
achieved a certain minimum level of general education before he 
is admitted to the upper division of the college, or to have him 
demonstrate a high level of achievement before granting him 
honors. Each of these situations poses its own problems, different 
from those already discussed, though there are some similarities 
in procedure if tests are used as a major criterion. Let us, then, 
review them in at least a general way. 

The granting of course credit on the basis of examination alone 
has not been widely practiced, although it may develop momen- 
tum. This lag has been partly due to a definite rejection of the idea 
by college faculties and partly to the fact that so few students have 
sought the opportunity. However, there is a growing emphasis in 
some high schools on providing enriched or college-level courses 
for superior students,¹ who may in turn wish to accelerate their 


¹ See College Entrance Examination Board, Advanced Placement Program and 
Charles R. Keller, “Piercing the Sheepskin Curtain,” in College Board Review, 
Fall 1956, p. 1. 

The report of the Ford Foundation, They Went to College Early, Evaluation 
Report No. 2, on the Program for Early Admission to College will also be of 


college work and will apply for admission to those colleges that 
offer that opportunity. Even though the proportion of students in 
this category remains relatively low, colleges hoping to provide 
diversified programs for expanding student enrollments may ex- 
plore the feasibility of granting credit on the basis of adequate 
performance on a test designed for the purpose. 

The superior entering student is not the only kind of student 
who may seek credit by examination. There are other situations 
in which either the student, the college, or both may consider the 
possibility of an accreditation by means of examination. Sometimes, 
students, because of illness or financial reverses, find it necessary 
to leave college before the completion of a semester or a year of 
course work; when they are able to return, they may wish to ad- 
vance with the students in their class or at least to have an oppor- 
tunity to make up some of the credits they have lost in the areas in 
which they feel most confident. Transfer students who believe that 
they have already covered certain prerequisites in another se- 
quence of courses may wish to receive credit toward their degree 
without taking (and possibly repeating) the sequence required in 
the new college. Or the college itself, uncertain of how to appraise 
the past work of transfer students in relation to particular required 
courses, may need to have students demonstrate the extent to 
which they have mastered major outcomes before granting credit. 

Whatever the reasons for setting up a program of credit by ex- 
amination, the major problem facing the college will be that of 
ascertaining whether given students have achieved a level of com- 
petency equivalent to that achieved b 
course or courses in question. This 
of waiving prerequisites without c 
in the extent to which the outco 


y those who have taken the 
problem is quite similar to that 
redit. The major difference lies 
mes of the course must be meas- 


develop, must be measured, Assu 


who receive credit by examination are as competent as those who 


interest to those who are considering the early admission of superi i 

r perior candidates. 
Available from the Fund for the Advancement of Education, 477 Madison Ave. 
New York 21, N.Y, = 
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have taken the course. Thus, an examination designed for waiving 
the course will not necessarily satisfy the requirements for granting 
credit; a separate and somewhat more comprehensive examination 
is needed. Also, some outcomes may be demonstrated only when 
certain projects, such as written reports, are completed. This could 
be made a requirement additional to examination performance. 

As in placement programs, the first step should be a thorough 
analysis of the content of the course in question. When the analy- 
sis has been systematically made, it is then possible to review ex- 
isting tests to determine the extent to which they correspond to 
the goals of the course. A natural temptation will be to use an exist- 
ing course final examination as a basis for awarding credit. If the 
final examination is already the sole basis for awarding grade and 
credit in the course, it would be difficult to make a case against 
using it without at the same time requiring some revision in the 
present course. 

However, the final examination is not usually the sole basis for 
awarding credit. There may be other examinations during the 
course, term papers, projects, or other assignments, and classroom 
discussions, all of which may serve as partial bases for evaluating 
the student's performance. If evaluation based on these is believed 
to be essential in determining whether or not the objectives of the 
course have been attained, then a way must be found to incorpo- 
rate into the examination for credit some measure of their ac- 
complishment. It may well be, in some cases, that an examination 
cannot stand in place of the course itself because too much of the 
course depends upon participation in the group growth and de- 
velopment and the general maturation that comes with systematic 
exposure to a body of ideas and knowledge. It also may be believed 
that there are intangibles that are acquired by exposure during 
the course and that cannot be measured by an examination. 

This last answer should not be accepted too readily, however; 
the intangibles may not be measurable because they are not there 
to begin with. Most instructors believe that their own courses 
actually accomplish much more than meets the eye. If the asser- 
tion is made that there are things that cannot be measured, then 
it is obviously impossible to prove or to disprove successfully the 
assertion that they are being acquired. The question, then, should
not be whether every possible benefit from the course is subject to
testing but whether an examination can be obtained or con- 
structed that evaluates the broad, important outcomes expected 
from the course. Usually this can be done; but frequently it will 
be different from the regular final examination given in the course. 
It may be that a published test or a test prepared by another 
college offering a similar course will be appropriate. Just as often, 
it may be necessary to construct a special test which will reflect 
the broad instructional outcomes of the course. 

A problem almost as crucial as the determination of the nature 
of the examination is the determination of the score level required 
for the granting of credit. A range of scores will be obtained in 
practice, and some cutting point must be determined so that scores 
above this level will establish credit and scores below will not. As 
mentioned earlier, for many courses there will be relatively few 
students requesting the opportunity to gain credit by examination.
This means that the data obtained from actual use of the test will 
remain quite meager for some time after its introduction—a major 
advantage of using an already developed test on which some norms 
are available. The latter might be either a published test with 
so-called national norms, or a test prepared for similar courses in 
another college or university. In either case, there is likely to be 
some indication of how a known group of students has performed 
on it. 

One could, of course, on a subjective basis alone, specify the 
level of achievement a student must reach to receive credit for the 
course. Although this approach has merit, in practice it has certain 
shortcomings. The desired level of accomplishment might be set 
at a level higher than that actually achieved by even the best students
who take the course. It is, therefore, wise to study the test
results to see whether students taking the course do perform at the
specified level of competency.


If there are a number of different expected outcomes for the 
course, and if they are not necessarily closely related, it is reason- 
able to require the student to achieve a passing score in each. This 
will ensure that he has covered the broad objectives of the course 
and not passed simply on the basis of high proficiency in one narrow
aspect. If the decision is made to use multiple cutting scores in
this fashion, one should be sure that the individual subtests on 
which the separate scores are based are sufficiently dependable to 
use as a basis for withholding or granting credit. 
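
The cutting-score procedure described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the subtest names, the score values, and the function names `earns_credit` and `share_meeting_cutoffs` are hypothetical and do not come from any published test. The first function applies a separate cutting score to each expected outcome; the second checks a proposed set of cutting scores against the results of students who actually took the course, as the text recommends before a subjectively chosen level is adopted.

```python
from typing import Dict, List


def earns_credit(subscores: Dict[str, float], cutoffs: Dict[str, float]) -> bool:
    """Grant credit only if every subtest score meets its own cutting score.

    A high score in one area cannot offset a failing score in another.
    A subtest missing from `subscores` raises KeyError rather than passing.
    """
    return all(subscores[area] >= cut for area, cut in cutoffs.items())


def share_meeting_cutoffs(group: List[Dict[str, float]],
                          cutoffs: Dict[str, float]) -> float:
    """Fraction of a reference group (for example, students who completed
    the course) whose scores would meet every cutting score."""
    if not group:
        raise ValueError("reference group is empty")
    passed = sum(earns_credit(scores, cutoffs) for scores in group)
    return passed / len(group)


if __name__ == "__main__":
    # Hypothetical cutting scores for two loosely related course outcomes.
    cutoffs = {"terminology": 60, "application": 55}
    # Hypothetical results of students who took the regular course.
    course_takers = [
        {"terminology": 70, "application": 60},
        {"terminology": 50, "application": 80},
        {"terminology": 65, "application": 58},
        {"terminology": 61, "application": 30},
    ]
    print(share_meeting_cutoffs(course_takers, cutoffs))
```

If only a small share of the students who took the course would meet the proposed cutting scores, the level has probably been set higher than the course itself achieves and should be reconsidered.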


COMPREHENSIVE EXAMINATIONS IN PLACEMENT 


In some institutions fulfillment of the objectives of basic courses 
at minimum levels only may be deemed an insufficient attainment 
to warrant the student's entering upon advanced study in the area 
or entering upon a major study in that area, and examinations can 
be used in reaching the decision. The point must be made, how- 
ever, that it is not the suggestion here that examinations alone be 
used without recourse to other criteria such as teacher recommen- 
dations. 

If a college contemplates such a move, it will do well to try its
examinations experimentally, keeping in view that the screening
points must be set realistically to be consonant with the institution’s
objectives. Again, examinations may be selected from published
examinations or be constructed to suit local needs; also, in
either case, a decision should be made as to whether the content of
the tests shall be general or specific in nature.

The institution that decides to use examinations on a “general”
basis may administer, for example, some such general battery as
the National College Sophomore Tests, the Graduate Record Area
Examination, or the Sequential Tests of Educational Progress.


Adopting a basis that, in the experience of the institution, has 
some validity, it may be found that few, if any, students below a 
certain score on any one of these examinations or its separate parts 
are good risks as advanced students, and they should, therefore, not 
be encouraged to continue. Or an alternative may be permitted: 
they may be conditioned, that is, allowed to continue with the 
understanding that they will be expected to do additional work 
to remedy the deficiency disclosed. For example, if a student passes
in all subjects except mathematics and if the college considers
proficiency in mathematics to be essential to the degree it wishes
to underwrite, then the student may be required to satisfy the
examination in mathematics sometime prior to graduation. In any
event, the faculty must first determine its own standards of
performance, and it follows that these should be realistic for the
stu[...] general requirement rather [...] for a particular department.
The [...] teacher education training program. Here, [...] criteria of
ability and personality would [...] considered, tests in English,
speech, funda[...] and general culture could be used and it [...]
examinations listed above [...]tion, it is probable that [...] should
be added.


4. The Use of Tests in Instruction 


A GREAT DEAL OF THE INFORMATION THAT IS COLLECTED IN THE 
process of testing for admissions and placement purposes also finds 
uses in instruction. Available even prior to instruction, general 
information about a particular set of students describing their scho- 
lastic aptitude, their previous achievement in the field, their read- 
ing level, their English skills, and so on can assist an instructor in 
gauging the general level at which he can teach a class and in identifying
those who are deficient or advanced. Even though his circumstances
may not be such that he can provide ideally for each group,
he at least possesses a more informed understanding of them. The 
economics instructor should be alerted to the fact that most of 
those enrolled in his course are superior students excused from a 
required beginning course and, at the other extreme, an English 
instructor with a group whose achievement in English is pre- 
dominantly low should be aware of his problem from the outset. 
The extent to which members of an instructional staff can receive 
initial cueing will be determined, of course, by the quality and 
extent of local testing programs and by the procedures for channeling
data to them.


PURPOSES OF TESTS IN INSTRUCTION 
Tests serve in instruction to clarify goals, to determine the 
initial status of students, to appraise student growth throughout 
the course, to appraise instructional materials and methods, and 
to stimulate learning. Each of these purposes will be treated here
in some detail.


To clarify the goals of instruction 

Instructors teach to increase their students’ appreciation of lit- 
erature, or their understanding of physical laws, or their ability 
to think straight, or to help them attain equally worthy goals. 
Ideally, the content and organization of instruction should, in all
[...] be derived from its ultimate goals, and evidence should be
[...] that they do operate to achieve them. [...]ship is ignored,
and the instructor [...] abstract and often vague statement of
ultimate goals and selects content and method on a basis of
tradition or opinion only.

If there were no necessity to appraise student progress, goals
could easily remain at an abstract level. Fortunately, the examination
of student attainment of basic instructional goals is a fairly
universal requirement in higher education, and in the process,
goals of instruction must be translated into the more specific
behaviors [...] students demonstrate by doing, thinking, or reacting
in some manner.

[...] course where one of the goals of instruction [...] critical
thinking, students are expected [...] certain material and to criticize
it from certain [...] exercise their attention may [...] elicit their
criti[...] questioning induces other [...]y of the material; finally,
a satis[...] summarizes what has been said and makes an additional,
more penetrating analysis. [...] generally pleased with the way
in which the students have analyzed the problem, even though all
students did not participate, how can he [...] modified for a fresh
trial.

This example illustrates the i[...]ining a goal, namely, defining
what it is, [...] exercises, identifying the student behavior which
demonstrates its attainment, and ending with appropriate evaluation.
But in practice it is rarely possible to proceed so smoothly. Rather,
a goal is stated and then, in selecting appropriate teaching material,
it may
be found that the goal needs further clarification. Or, in preparing 
the evaluation, it may be recognized that certain of the assign- 
ments are unsuited to the development of some of the specific 
skills. Or, in reviewing student responses to the problem presented 
by the test question, it may be found that the goal is unrealistic 
in terms of what students are able to accomplish. Thus, a constant
interaction will occur among the various processes which eventu- 
ally leads to modifications, and a clearer statement of the aims, 
content, and methods of the course should emerge. In this frame
the purposes of achievement testing are much more fundamental
in instruction than is indicated in the more commonly stated
purpose of grading students.¹


To determine initial status 

While college courses of instruction universally assume the ex- 
istence of certain initial background on the part of students, it is 
helpful to know that all students really possess this background. 

Pretests, either locally constructed or of the standardized variety, 
can be of great use in ascertaining this. In addition to information 
about the class level and about individuals in the class that can be 
gleaned from any general test programs of the college, the instruc- 
tor may use special tests to obtain information on the more spe- 
cific knowledge, skills, and abilities of individual students in 
relation to his subject, so that he may determine his most suitable 
course of action. 

There are several possible alternatives which might follow upon 
measurement of initial status. One is that students with insufficient 
background will be excluded from the course, for it is not always 
possible to bridge their deficiencies; the physics instructor cannot 
stop to teach algebra to students who have not mastered the mini- 
mum essentials nor can the literature instructor wait for the 
mastery of reading skill. But even if obviously incompetent stu- 
dents are eliminated, considerable differences among the remain- 
ing students may still exist. The particular starting point and the 
particular sequence of a course should be, so far as possible,
adapted to their status. It is well then for the literature instructor
to know how much emphasis he will need to give to classical allusions
and for the Spanish instructor to know what Spanish vocabulary
has been mastered by his beginning second-year class. Some
students will be found to be minimally prepared; others may already
have mastered minimum goals. If the latter are to be profitably
engaged, either they must be given opportunity to pursue
additional work within the framework of the course or they must
be placed in more advanced courses.

¹ For a further development of this process, see Ralph W. Tyler, “Measurement
in Improving Instruction,” in E. F. Lindquist (ed.), Educational Measurement,
pp. 49 ff.


To measure terminal status and growth 


How much knowledge [...] and to what extent th[...] achievement
[...] test to establish some base line for [...]

Also, there are some rather basic [...]taining measures of progress
on individual students, especially for students who start rather
low on the achievement ladder and who, at the end of the course,
still appear relatively low when compared with other students.
Some of these students may actually have learned a great deal more
than is revealed by their final grades, which normally reflect only
their rank on some “absolute” scale or in comparison with other
students. Growth measures may help
to show these students that they actually have benefited from the 
course and will thus encourage them. They also provide data 
which aids the instructor in planning appropriate educational 
experiences in his course. 

Although there are advantages in charting student progress, 
there are certain technical problems which complicate the inter- 
pretation of scores. It often happens that students with the lowest 
scores in an initial test show relatively greater gains than students 
whose initial scores, while low, were not quite so low and that 
students who originally earned very high scores earn somewhat 
lower scores at a second testing, even after instruction. At first 
glance, it may seem that the poorest students progressed most, 
while those who were best performers at the outset made no progress,
in fact, regressed from original achievement. Logic would
question this, and it should, for, indeed, this is not the interpreta- 
tion that is justified, regardless of appearances. What is often being 
demonstrated by such score patterns is simply a phenomenon 
known as “regression toward the mean,” which stems largely from 
inaccuracy of measurement. Tests never provide more than a sam- 
pling of all the tasks that an individual could be asked to perform. 
He might perform somewhat differently on a second, though quite 
similar, set of tasks and earn a quite different score. He might 
undertake the same or a similar test with a quite different attitude 
or motive, or any number of other factors—all beyond our control 
—might operate to affect the final result. At best, then, a test score 
is only an estimate of a person's true achievement. 
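
The regression effect described above can be reproduced with a small simulation. This is a hedged sketch, not a description of any actual testing program: every number in it (a true-score mean of 50, a true-score spread of 10, a measurement-error spread of 5, 2,000 students) is an assumption chosen only for illustration, and the function names are hypothetical. Each simulated student is tested twice with no instruction in between, so his true achievement does not change at all; any apparent "gain" or "loss" comes from inaccuracy of measurement alone.

```python
import random
from statistics import mean


def simulate_retest(n_students=2000, true_sd=10.0, error_sd=5.0, seed=0):
    """Return (first_score, second_score) pairs for students whose true
    achievement is identical at both testings; only the error differs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_students):
        true_score = rng.gauss(50.0, true_sd)           # assumed true level
        first = true_score + rng.gauss(0.0, error_sd)   # first testing
        second = true_score + rng.gauss(0.0, error_sd)  # second testing
        pairs.append((first, second))
    return pairs


def mean_gain_by_group(pairs, fraction=0.10):
    """Average second-minus-first change for the students with the lowest
    and the highest first scores (bottom and top `fraction` of the group)."""
    ranked = sorted(pairs, key=lambda p: p[0])
    k = max(1, int(len(ranked) * fraction))
    lowest, highest = ranked[:k], ranked[-k:]

    def gain(group):
        return mean(second - first for first, second in group)

    return gain(lowest), gain(highest)


if __name__ == "__main__":
    low_gain, high_gain = mean_gain_by_group(simulate_retest())
    print("mean change, lowest initial scores: ", round(low_gain, 2))
    print("mean change, highest initial scores:", round(high_gain, 2))
```

Under these assumptions the lowest initial scorers appear to gain and the highest appear to lose, although no one has learned or forgotten anything between the two testings.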

For any individual, scores from separate testings may vary in 
either direction. When these initial scores are quite different from 
the score which represents the average performance of a group, 
there is greater likelihood that a second score will move in the 
direction of this average rather than away from it; second scores 
are often somewhat higher than original very low scores and some- 
what lower than original very high scores, due to chance variations.
This statistical phenomenon will be found discussed in any
textbook on statistics.²

² See also “How Accurate Is a Test Score?” Test Service Bulletin, No. 50, June
1956. Available from the Psychological Corporation, 522 Fifth Ave., New York, N.Y.

Suffice it to say that such measures can be used as a rough guide
to development, provided score differences are large enough to
warrant the assumption that they are due to more than chance
alone and provided the tests or evaluation devices are inclusive
and comprehensive enough to cover what has been taught and
experienced in the course. If the course goals have been clearly
defined in terms of student behavior, and if the evaluation tasks
have been carefully selected to reflect the content and emphases
of the course, rather than the general objectives which could be
[...] precision of measurement is not as refined as might be desired.


To evaluate teaching materials and procedures 


It frequently happens that the teacher whose course is well
planned and organized assumes that his materi[...] are satisfactory
and that any [...] students results from student ine[...] instruction
is at fault. With the use of suitable tests or assignments he can
determine whether certain approaches have induced good learning
experiences. If learn[...] new way to accomplish his [...] the
instructor will constantly [...] effectiveness of his instruction [...]
light of what he finds; in this [...] of evidence, [...] learning of
students. The [...] largely by the examina[...] information
prerequisite to critical thinking [...] will, in their future study,
focus on learning the information they will need for such answers.
If, on the other hand, test questions
are framed so that the student is required to organize and relate 
information in a manner requiring his own independent critical 
thinking, there is much greater likelihood that he will, in his 
study, practice the critical analysis of materials. 

If tests are to serve as guides to study, it is necessary that they
be used periodically throughout the course. Some instructors have 
found it helpful to provide practice test questions for students, 
particularly if the form of test varies much from the forms the 
students are acquainted with. Usually some discussion takes place 
in advance of the examination, designed to help the student in his 
preparation. The variation of examination form to prevent over- 
emphasis on one narrow type of preparation is sometimes effective 
in forcing students to consider material from different angles. 

When students are given an opportunity to test their learning 
by responding to test questions or other evaluative devices, when 
they learn whether or not their responses are correct, and when 
they have an opportunity to correct misinformation, there is
greater advance in learning than when they lack these opportunities.
There is much to be said for the discussion and analysis of the
examination answers in a later class meeting, and, as a by-product,
the instructor also has an opportunity to learn why students answered
questions in a particular way. This often gives him insight
into the good points or shortcomings of some of his methods or
highlights some of the weaknesses of the test questions and leads
to the improvement of learning exercises and test questions in the
future.

Returning tests and discussing test questions in class poses a 
problem for the instructor who maintains a test file for future use, 
for copies of the used questions may fall into the hands of future 
students and threaten test security. One way to circumvent this 
difficulty is to maintain two files of alternate test questions designed
to measure the same skills and abilities. The second file
would serve the class discussion purpose. Another way is to discuss
the test in general in class without the actual return of papers:
after all, the student will recall his errors and from the point of
view that any test is a mere sample of a larger body of content, [...]


THE ADVANTAGES AND DISADVANTAGES
OF STANDARDIZED TESTS 
It is quite obvious that the content of published tests may not
be suited to measuring attainment of all the objectives that a
particular instructor envisages for his class. Yet, such tests may
possess [...] which teacher-made tests do not [...] tests designed for
use in specific college courses. Except for tests in [...]lish, in
mathematics, in chemistry, [...] economics, sociology, American
literature, English, United Sta[...] logic, history of philosophy,
educational [...] general psychology, Shakespeare, Chaucer, and many
others. It [...]
consuming task for the instructor to prepare a sound classroom
test; if a basic test were available, he could devote his energies to 
the preparation of the supplemental tests needed for special aspects 
of his course. 


THE DEVELOPMENT OF LOCAL EXAMINATIONS 


Since it is most unlikely that he will find entirely suitable tests 
elsewhere, the instructor must eventually draft his own quizzes, 
tests, and examinations. If his materials and assignments have been 
selected in relation to well thought-out goals, including a precise 
statement of changes expected in student behavior, much of the 
basic material for his task is already available. The problem then 
becomes one of organizing this material in such a manner that 
good questions will be produced. A good test plan will indicate 
in detail what it is that the test will measure and will try to formu- 
late answers to such questions as: What are the objectives being 
sought? How does the student show that he has achieved these 
instructional goals? Which of these behaviors can be measured by 
the indirect method that the written test situation imposes? To 
what extent can the test exercises be made to approximate the 
actual behavior? Is it feasible to substitute test demonstrations for 
paper-and-pencil tests? (This is seldom practicable, except on an 
occasional basis to determine the extent to which the written 
test is a proper or improper substitute for the true demonstration
of achievement.) Should the test measure the extent to which a 
limited number of objectives or a certain unit of the course has 
been mastered, or will it attempt to cover more of the course? 
Will the test focus primarily on ability to reproduce content which 
has been covered specifically, or will it attempt to measure the 
ability of the student to apply what has been taught to new materials?


It is rarely possible to build a test without altering its original
plan from time to time. For example, in identifying concrete situations
to demonstrate the attainment of a particular goal, the plan
may begin to show insufficient detail or a question will emerge in
writing which is good and obviously suitable, but it may be found
that no category exists for classifying it. It may then be realized
that a quite valid goal or category has been overlooked in the
listing, and it will consequently be added. Thus, continual interaction
of the various steps in the process leads to a general clarification
of instructional goals. For this reason the examination which
serves the course best is that which has been planned, although not
necessarily written, in advance of teaching, rather than one which
is written two days before the typist sets a deadline for receipt
of copy. A plan will rightly focus attention on the instructional
goals to be covered, and the time spent in spelling them out is one
of the best guarantees of developing simultaneously both a better
course and a better test.

An early step in the test plan is to provide for coverage that is
both adequate and in balance with the emphasis given to e[...]
instructional goals. Some teacher-selected body of materials and
experiences, as well as those which the student elects, may have
been utilized in the attainment of a particular objective. If the
examination is to have “face validity,” that is, incorporate material
that the student considers fair (and there are important motivational
factors in this respect to heed), those materials should be
sampled in accordance with the course emphasis. Sometimes an
examination proves to have an imbalance in emphasis simply
because the writer unconsciously yielded to the temptation to
con[...] actually much [...] the latter objective. Therefore,
to avoid the danger of [...] incorporated in the plan an estimate of
the number of questions that will be devoted to each phase of [...]

An exact identification of the specific behaviors expected of the 
student will further refine the raw material from which the questions
later will be developed and provide speci[...] this identification,
the instructor will need to distinguish between
content appropriate for grading purposes and content suitable
only for evaluating the effectiveness of instruction. [...]
while the development of appreciation for [...]
major goal of a course in American history, [...]
grading students. Such efforts are likely to be more successful when
they are divorced from grading. 

Also, the itemizing of expected student behaviors frequently 
suggests the kinds of tasks or problems which seem most appropriate
for measurement. It soon becomes apparent in the process of
the work whether essay or objective questions will better serve
the basic purposes of the test or whether an open-book examination
or other project will provide a better measure. At this stage
of planning, decisions regarding the types of questions to be used
are quite in order. If they are made too early, however, thinking
will be restricted to the outcomes which can be most easily meas- 
ured by a certain form of question and equally important goals 
may be neglected altogether or, at best, poorly sampled. 


Deciding on the kind of question to use 

If the examination is to be a written one, and most are, a
decision must be made on whether to choose a form that is objective,
essay, or some combination of them.

In the objective-essay controversy, even the prejudiced instruc- 
tor grants the objective-test form two advantages: it is simple to 
grade and it permits a much more comprehensive sampling of the 
course materials. While the experience of test builders indicates 
that it is possible to develop objective tests that do measure many 
complex abilities, the skeptical college instructor, reviewing his 
own student experiences, or thinking only in terms of some objective
tests he has seen or written, questions this. He may thus believe
that essay or discussion questions designed to elicit creative
critical responses are preferable. But as he considers an essay
approach he also recalls how he had to ponder his former instructors’
meanings in test questions, and he also is aware of the tendency
of his students to parrot his classroom discussions, or their
readings, in such examinations.

Either approach, then, objective or essay, may have disadvantages.
Thus, in deciding the issue his best course is to examine
the values and limitations of each and to consider which approach, 
or modification thereof, better serves the intrinsic purposes of his 
instructional goals. 


Kinds of objective questions. 
reader here with descriptions of the € 


— There is no need to belabor the 
ommon forms of objective 
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test items. Ebel? provides such descriptions, states some of the car- 
dinal principles in developing them 
out, it has become increasin 
“such characteristic diff 


of trivial consequence when compared with the extreme differences 


the same form,” Therein lies a warning 
tly easy to Prepare and that the task of 
approached by the uninitiated with due 


ariations of the forms described by Ebel that 
peal for the more advanced test writer or for 


truction. These are arranged around 


wing instructional objectives: 
A. Knowledge
   1. Of specifics
      a) Of terminology
      b) Of specific facts
   2. Of ways and means of dealing with specifics
      a) Of conventions
      b) Of trends and sequences
      c) Of classifications and categories
      d) Of criteria
      e) Of methodology
   3. Of universals and abstractions in a field
      a) Of principles and generalizations
      b) Of theories and structures
B. Intellectual abilities and skills
   1. Comprehension
   2. Translation
   3. Interpretation
   4. Extrapolation
   5. Application
   6. Analysis
      a) Of elements
      b) Of relationships
      c) Of organizational principles
   7. Synthesis
      a) Production of a unique communication
      b) Production of a plan or a proposed set of operations
      c) Derivation of a set of abstract relations
   8. Evaluation
      a) Judgments in terms of internal evidence
      b) Judgments in terms of external criteria

*Robert L. Ebel, “Writing the Test Item,” in Educational M 
193 ff. and 189. 


* Benjamin S. Bloom 
(New York: Longmans, 
There are other sources that might also be scanned for the suggestions
they provide for achievement test forms. Most of these
works will be cited in chapter 6 for their usefulness in other respects.
The Dressel portfolio of science test items provides many interesting
test forms, classified by objective being evaluated according
to the Taxonomy mentioned above.⁵ The instruments developed
in the evaluation program in the Eight-Year Study, albeit almost
as well known for their scoring complexities as for their intrinsic
merits, are described by Smith.⁶ Instruments developed in the two
American Council cooperative college studies in general education,
one directed by Tyler (1942)⁷ and one by Dressel (1954),⁸
are described in some detail in the reports of those projects. The
theory of the work-sample approach exemplified in the USAFI
Tests of General Educational Development is described by Lindquist.⁹

The essay examination.—A good essay question provides for 
depth of response, and this and its influence on study are its most 
important assets. Essay questions must be as carefully planned as 
objective examinations; properly stated, questions can be used to 
estimate the student's ability to organize discussion of cause and 
effect relationships, or to evaluate conclusions, or to reveal insight 
in critical analysis. 

But the essay question also has serious limitations. Intended to 
show how well the student can marshal and organize ideas—a 
skill considered to be of permanent value—it requires that this be 


⁵ Paul L. Dressel and Clarence H. Nelson, Questions and Problems in Science:
Test Item Folio No. 1 (Princeton, N.J.: Educational Testing Service, 1956).

⁶ Eugene R. Smith, Ralph W. Tyler, and the Evaluation Staff, Appraising and
Recording Student Progress (New York: McGraw-Hill Book Co., 1942).

⁷ Four volumes report this work: Paul A. Brouwer, Student Personnel Services
in General Education; Harold B. Dunkel, General Education in the Humanities;
Albert William Levi, General Education in the Social Studies; and Cooperation in
General Education: A Final Report of the Executive Committee of the Cooperative
Study in General Education. All these volumes have been published by the American
Council on Education, Washington, D.C.

⁸ Paul L. Dressel and Lewis Mayhew, General Education: Explorations in Education
(Washington: American Council on Education, 1954).

⁹ E. F. Lindquist, “The Use of Tests in the Accreditation of Military Experience
and in the Educational Placement of War Veterans,” Educational Record, 25: 357-76,
October 1944.

¹⁰ In addition to those mentioned, the reader may wish to examine other bibliographies
on tests in instruction. A fairly recent one appears in: J. Raymond
Gerberich, Specimen Objective Test Items: A Guide to Achievement Test Construction
(New York: Longmans, Green & Co., 1956).

evaluation need not be limited to paper-and-pencil techniques.
Often direct observation of performance is quite feasible and is 
more reliable than indirect measurement. Thus, in a laboratory 
course, it is more telling to observe whether the student can use 
the balance scales and other instruments than to have him write 
about them. If necessary, rating scales can be prepared specifically 
to identify his facility in the elements of each operation. In some 
courses special projects which call for original work may be the 
only valid way of evaluating certain kinds of achievement. For 
example, while courses in short story writing may provide back- 
ground on the principles and techniques of creating short stories, 
their primary objective will be demonstrated only when the student 
writes an acceptable short story.

In constructing evaluation instruments the instructor should 
not feel confined to traditional media. The effectiveness of a test 
can be much enhanced by imaginative techniques. These may be 
difficult to control, but experience will help refine them. Botany 
classes can be sent out to the college campus to examine natural ma- 
terials; graphs, art reproductions, stories of group interaction, and 
the like can be projected on the classroom screen; students can hear 
music, speeches, argument, foreign languages, literary productions, 
and the like recorded on tape; laboratory experiments in physics 
can be set in motion. These and similar media can provide the basic 
content or backdrop for testing—whether the tests be objective or 
essay—and, because they are more dramatic, can arouse student in- 
terest in the examination to a high level and thus motivate learning. 

The open-book examination.—The open-book examination is a 
technique which can be adapted to evaluating certain kinds of 
results of course instruction. One of its major values is its approxi- 
mation of real life problems for which it is more important to 
know the source of information and how to use it than to be able 
to recall it. In the same way that a writer refers to a grammar to 
solve a problem of syntax or usage, or an engineer refers to tables of 
constants or model problems to solve a mathematical problem, or a 
sociologist refers to a textbook on statistics to select the best 
method for analyzing his data, so also the instructor, in evaluating 
the student’s ability to use sources skillfully, can devise problems 
that require reference to special sources. Naturally, the problem is 
the better if it requires integration and interpretation of materials

Lastly, the difficulty of each question must be appraised: Were
some questions so difficult that practically no student got them
right? Were others so easy that virtually all students got them
right? And how difficult was the test as a whole?

In the following sections, aspects of test analysis—distribution 
of responses, the discriminating power of items, and item difficulty 
—are discussed, and one method of analysis applicable even to 
groups as small as thirty or forty is presented. (However, it should 
be realized that for small groups the mathematical indices which 
the method provides will fluctuate more than with large groups. 
To obtain reasonably stable indications of item difficulty and dis- 
crimination on the basis of one test administration, it is recom- 
mended that analysis be based on 100, and preferably more, cases.) 


Distribution of responses 

The distribution of responses may be obtained as follows: 

1. Place test papers in order of score from highest to lowest.

2. Select the one-third of the papers having the highest scores 
and the one-third of the papers having the lowest scores. 

3. For each item in turn, tabulate the number of students in the 
top one-third (hereafter called the Highs) and the number of 
students in the bottom one-third (hereafter called the Lows) choos- 
ing each possible response to the item. 
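
The three steps above can be sketched in code. This is only an illustration under stated assumptions: the function names `split_highs_lows` and `tabulate_item`, the representation of each paper as a (score, answers) pair, the six sample papers, and the convention of ignoring leftover papers when the class does not divide evenly by three are all hypothetical choices made for the example.

```python
from collections import Counter

def split_highs_lows(papers):
    """Return (highs, lows): top and bottom thirds of the papers by score.

    `papers` is a list of (total_score, answers) pairs; `answers` maps an
    item number to the response chosen, or None if the item was omitted.
    (This data layout is an assumption made for illustration.)
    """
    # Step 1: order the papers from highest score to lowest.
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    # Step 2: take the highest-scoring third and the lowest-scoring third.
    third = len(ranked) // 3
    highs = ranked[:third]
    lows = ranked[len(ranked) - third:]
    return highs, lows

def tabulate_item(group, item):
    """Step 3: count how many papers in `group` chose each response."""
    return Counter(answers.get(item) for _, answers in group)

# Six hypothetical papers, each answering item 38 only.
papers = [
    (90, {38: "B"}), (85, {38: "D"}), (70, {38: "B"}),
    (60, {38: "A"}), (40, {38: "C"}), (35, {38: None}),
]
highs, lows = split_highs_lows(papers)
high_counts = tabulate_item(highs, 38)
low_counts = tabulate_item(lows, 38)
```

Here `high_counts` and `low_counts` hold, for each group, the number of papers choosing each response, with `None` standing for an omitted item.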

The tabulation might look like this for a four-choice multiple-choice
item:

Item 38: The height of the tide is dependent, in part, upon the
position of the moon in relation to the

    A  planets
    B  earth
    C  plane of the ecliptic
    D  sun

                              RESPONSES
ITEM                                                   Not
No.    GROUP    Omitted     A      B*     C      D     Reached
38     Highs       5        23     50     10     12       0
       Lows        8        35     10     42      3       2

* Correct answer.
It would appear from the figures above that item 38 discriminated
quite well between Highs and Lows, since five times as many
Highs as Lows responded correctly. Distractors A and C operate
well, since both drew a large number who did not know the correct
answer. Distractor D, however, bears study, as more Highs
than Lows chose this as an answer. It is entirely possible that
choice D may have been so worded that the Highs read into the
distractor more than was intended, or the question, as in the case
of the example given, may have had two correct answers. Thus it
is that the distribution of responses often reveals ambiguities no


Discriminating power of the item


Of prime importance is the question of how well an item discriminates
between good and poor students. An index of discrimination
of an item may be obtained simply by finding the difference
between the number of Highs and Lows answering the item
correctly; this difference between the two sets of correct responses 
divided by the maximum possible difference gives the ratio to be 
used as the index of discrimination. 
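
As a worked illustration, the index can be computed for item 38 from the tabulation given earlier. The sketch below assumes equal-sized High and Low groups, so that the maximum possible difference is simply the number of papers in one group; the function name `discrimination_index` is hypothetical, and the group size of 100 is taken from the fact that the Lows' responses to item 38 sum to 100.

```python
def discrimination_index(highs_correct, lows_correct, group_size):
    """(Highs correct - Lows correct) / maximum possible difference.

    With equal-sized High and Low groups, the largest possible difference
    occurs when every High and no Low answers correctly, so the maximum
    equals the size of one group.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (highs_correct - lows_correct) / group_size

# Item 38: 50 Highs and 10 Lows chose the keyed answer (B),
# with 100 papers in each group.
item_38 = discrimination_index(50, 10, 100)
print(item_38)  # 0.4
```

An item that more Lows than Highs answer correctly yields a negative index under the same formula.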

Although the desirable size of the index of discrimination will 
vary according to the purpose of the test, the range of ability within 
the group, the size of the sample, and the complexity of the ma- 
terial, it is generally thought that, for most achievement tests, 
indices that are negative or that range from 0 to .20 are low; 
those from .20 to .40 are average, while those of .40 or more are 
highly discriminating. Obviously, the minimum and maximum 
values of the discrimination index are minus 1.00 and plus 1.00, 
respectively. 
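
The rough bands just described can be expressed as a small helper. This is a sketch only: the function name `describe_index` is hypothetical, and because the text gives .20 and .40 as the end of one band and the start of the next, assigning a value of exactly .20 to "average" is an assumption made here. The bands themselves are the general guide for most achievement tests stated above, not fixed rules.

```python
def describe_index(index):
    """Label a discrimination index as 'low', 'average', or 'high'."""
    if not -1.0 <= index <= 1.0:
        raise ValueError("index must lie between -1.00 and +1.00")
    if index < 0.20:   # negative values and values below .20
        return "low"
    if index < 0.40:   # .20 up to, but not including, .40
        return "average"
    return "high"      # .40 or more

labels = [describe_index(value) for value in (-0.10, 0.05, 0.25, 0.40)]
print(labels)  # ['low', 'low', 'average', 'high']
```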

In the kind of analyses presented above, the assumption has 
been made that the total score of the test is a true indicator of 
achievers and nonachievers. This assumption is safe if the test is
a good one and if all of its questions aim to reflect the
same ability. Since classroom tests seldom measure one objective
only, this is not a safe assumption for most achievement tests. In 
some analyses it is assumed that, while several objectives are 
represented in the questions, good students will be successful in all 
of them and poor students in none; this again is a questionable 
assumption. Unless further examination of the test items is made, 
either of these assumptions could lead to the incorrect discard of 
perfectly good questions which happen to exist in the test in a 
minority and therefore carry less weight in the total score; or they 
may similarly lead to the perpetuation of test items of a certain 
nature, not necessarily most desirable, because they happen to be 
present in the test in great majority and therefore contribute


* This method was proposed by Robert L. Ebel. Details of the method and
the assumptions involved are discussed by Ebel in “Procedures for the Analysis of
Classroom Tests,” Educational and Psychological Measurement, 14: 352-63, 1954.
Methods utilized by other technicians are described by Frederick B. Davis in
Educational Measurement, pp. 266 ff.

mately a third of a class emphasized a particular chapter in their
study because they guessed it to be the instructor’s field of specialty,
those students would be high performers in the test and
item statistics would be spuriously good; this would not mean the
test was good, especially if it was supposed to be representative of a
unit of work that covered, say, six chapters.

Item analyses, if properly interpreted by the instructor, can 
indicate the quality of his examination effort, suggest areas where 
learning has been poor, and help determine whether there are 
individuals of the class who require remedial work. Item analyses, 
however, are neither infallible nor completely informative: they 
never disclose directly how the individual student reacts to the 
particular questions. Thus, sometimes, item analyses need to be 
supplemented by interviews or by class discussion of questions to 
elicit a deeper and more exact reflection of student thinking. It 
may be found, for example, that students answer the questions correctly
for the wrong reasons, or that a question which was intended
to require primarily factual recall elicits from the student
complex rational thinking that involves much more than factual
recall. This “thinking out loud” with students is profitable to both
instructor and students.


5. The Use of Tests in Educational 
Counseling 


FROM THE EARLIER DISCUSSIONS OF TESTS FOR ADMISSIONS AND PLACE- 
ment purposes, the reader will see that essentially tests used for 
those purposes also serve educational counseling functions. The 
present chapter is concerned with some additional uses of scholas- 
tic aptitude and achievement test data by both faculty counselors 
and administrative personnel in charge of college guidance activi- 
ties. While other types of instruments and techniques will be cited 
as important aspects of the counseling process, they will not be 
discussed in detail. Also omitted are measures of nonintellectual 
factors because (a) they have not yet been developed to the level 
of technical excellence and ease of administration comparable to 
that of measures of scholastic aptitude and achievement, and (b) 
their use requires training and experience so specialized that a 
limited description would fail to define their proper application 
in good clinical counseling. The discussion here will be of prob- 
lems and techniques more generally useful to the nonspecialist. 


DETERMINING COUNSELING NEEDS 

Counseling often takes place on an informal and part-time basis. 
Admissions officers, for example, frequently advise prospective stu- 
dents (and their parents) regarding their chances of success at a 
given college. Members of the general administrative staff and 
department heads answer many queries concerning course require- 
ments, programs of study, and so on. Faculty members, in their 
capacities both as advisers and teachers, help students avoid irreg- 
ularities in course sequences, work with them in developing suit- 
able attitudes toward study, and aid them in seeing the relation- 
ship between their studies and their needs. Counseling also occurs 
directly in classroom instruction, particularly in courses designed 
to influence attitudes, with the skillful instructor stimulating the 
student to assume responsibility for raising questions that perplex
him so that they may receive specific attention; the relationship 
so engendered carries over in their individual contacts outside the 
classroom. 

Because the counseling function is spread among so many per- 
sons, there is a danger that some serious problems may not be 
recognized and that some students will, therefore, fail to receive 
adequate help. This danger can be minimized if a college evolves 
a clearly defined point of view toward its responsibilities for pro- 
viding specific kinds of guidance services and when the nature of 
those services becomes understood by all members of the college 
community. Therefore, a college that is initiating or expanding 
counseling services must give serious attention to assessing the 
nature and scope of student problems on its campus and then 
determine what it can and should do in counseling. 


Studying the causes of student difficulties 


Although no single approach to the appraisal of college counsel- 
ing needs will cover all contingencies, the illustration of a hypo- 
thetical college which is trying to identify the causes of academic 
failure among freshmen will set forth a number of basic principles. 

Let us assume that at College X, despite careful selection and 
placement procedures, academic failure among freshmen is alarm- 
ingly frequent. Some faculty members attribute this to the poor 
study habits of the students; others, to their lack of preparation in 
tool subjects; still others, to adjustment problems stemming from 
the excitement of living away from home; and there are those who 
maintain that the general caliber of the student body is not equal 
to the curriculum. A systematic review of the data on hand— 
secondary school records, aptitude and placement test data, and 
certain other additional data—may serve to identify one or more 
causes of freshman difficulties. To begin with, the college may 
review the class admissions and placement data and compare them 
with similar information for previous classes and may find that the 
class as a whole is of about the same caliber as recent entering 
classes. 

It might then be advisable to examine the relationship between 
the various data and actual failure: it may be found that, by and 
large, the students with the poorest admission and placement data
are having the most trouble. But the relationship probably will 
not be clear-cut, for some will be found to be doing well, and 
some of the better students, failing. A closer look at the back- 
grounds of the failing students may indicate that some of them had 
had problems of personal-social adjustment while they were still 
in high school, which might be affecting their work now. But 
again this will not be true for all. 
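
One simple way to examine such a relationship is to rank the class on a placement score and compare failure rates in the lowest, middle, and highest thirds. The sketch below is purely illustrative: the function name `failure_rates_by_third`, the (score, failed) record layout, and the nine-student class are hypothetical and do not come from any actual college's data.

```python
def failure_rates_by_third(records):
    """Failure rate in the lowest, middle, and highest third by score.

    `records` is a list of (placement_score, failed) pairs, where `failed`
    is True for a student failing one or more courses. Returns three
    proportions, with None for any group that is empty.
    """
    ranked = sorted(records, key=lambda record: record[0])
    n = len(ranked)
    bounds = [0, n // 3, (2 * n) // 3, n]
    rates = []
    for start, stop in zip(bounds, bounds[1:]):
        group = ranked[start:stop]
        if group:
            failures = sum(1 for _, failed in group if failed)
            rates.append(failures / len(group))
        else:
            rates.append(None)
    return rates

# Hypothetical freshman class of nine students.
records = [
    (31, True), (35, True), (38, False),
    (44, True), (47, False), (52, False),
    (58, False), (61, True), (66, False),
]
rates = failure_rates_by_third(records)
```

In this made-up class the lowest third fails most often, yet one student in it is passing and one in the highest third is failing, which is the kind of blurred relationship described above.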

Although the review thus far has yielded some information on 
possible causes of failure that might have been expected for some 
students, it by no means has accounted for enough cases to 
satisfy the inquiry. Therefore, several other approaches might be 
revealing. The courses failed might be reviewed to determine if 
failures are confined to one course. Freshman faculty advisers 
might be asked to report any personal-social adjustment problems 
of their advisees, and the students might be asked to complete a 
study habits inventory and a personal-social problems check list. 

With this additional information, the college can further nar- 
row down its study of freshman failures, getting at specific reasons 
for, say, a high proportion of failures in physics. It may well be 
that the students do not have sufficient skill in mathematics to 
handle the material as it is taught, despite satisfactory scores on the 
mathematics test. A further inspection of the test may indicate 
that it does not include some of the skills and abilities which the 
physics instructors are taking for granted. Similarly, failures in 
other courses, and the extent to which study habits or personal 
problems really are contributing to failing grades, may be ex- 
amined. 

When as much information as is thought to be available has 
been accumulated on the matter of freshman failures, College X 
must next face the problem of what to do to remedy the situation. 
What can and will be done will depend in part upon its academic 
traditions and philosophy and in part upon what is administra- 
tively feasible in terms of staff and budget. 

There may be the feeling that all the colle 
attack the problem on a group basis by (1) sel 
priate mathematics placement test, (2) establ 
credit mathematics course for students maki 
utilizing a special period in the first weeks of the semester for
intensive training in effective study habits for the deficient group. 
Any organized work with individual students, especially those who 
appear to have adjustment problems, it may be decided, will have 
to wait until the college can afford additional staff. Or, it may be 
thought essential, in addition to these administrative and instruc- 
tional changes, to begin as soon as possible to work with indi- 
viduals, and the college may therefore immediately secure the 
services of a person trained in counseling. It may then institute 
an in-service training program for those members of the instruc- 
tional staff who seem most interested in and capable of handling 
guidance problems, and schedule interviews so that entering stu- 
dents may review the possible or actual difficulties facing them 
and the courses of action that might be adopted to circumvent or 
correct them. 

This illustration, though largely hypothetical and by no means 
universal in its application, demonstrates several points: the role 
of tests in a survey of remedial and counseling needs; the supple- 
mentary nature of test data in surveys of group and individual 
problems; the choice of the instruments appropriate to the 
problem or problems being investigated; the use of data from 
other institutional programs for surveys or for individual counsel- 
ing needs; the values of local, as well as published, normative data; 
the possible contributions of group surveys in the identification of 
problems for administrative planning purposes; and the choice of 
follow-up action that is appropriate to local conditions. 


Defining categories of students who should be counseled 


The establishment of the categories of counseling can be ap- 
proached in several different ways. One way is to establish them 
in terms of particular kinds of educational, vocational, or personal 
problems that students experience. Thus, in the example just 
cited, the college identified such problems as academic deficien- 
cies, poor study habits, adjustment problems, and so on, as fac- 
tors which contributed to freshman failures. In another sense, 
however, it established a working category in terms of academic 
levels. It identified freshman students as a group that might need 
special assistance. This establishment of categories in terms of 
academic levels can often form a practical basis for helping students
with their problems, since each different period in a student’s
career brings somewhat different stresses and strains which
may be alleviated by providing some kind of systematic counsel- 
ing. 

The establishment of functional categories may be considered a 
longitudinal dimension for the organization of guidance services. 
The second approach may be considered a horizontal dimension 
for the establishment of counseling activities. The two dimensions
are not mutually exclusive. Obviously, sophomores as well as entering
freshmen may have faulty attitudes which interfere with effective
academic performance, and freshmen as well as seniors may be
concerned about their future vocational plans. Just as the two
dimensions are not mutually exclusive, it is almost equally certain
that no one college would wish to establish its counseling activities
solely in terms of one rather than another dimension. Nevertheless,
viewing counseling problems first in one way and then in
another can be helpful in planning for and working toward an
appropriate balance between counseling students who are chosen
according to administratively defined categories and counseling
individual students who,


help.


interests to the academic and social program that the college
offers. Both the college and the applicant will benefit from this 
review. The college can be more certain of the wisdom of its 
ultimate decision in accepting or rejecting the applicant; and for 
the applicant, whether he is accepted or rejected, it can be an im- 
portant step in self-knowledge. He may be genuinely puzzled about 
the demands of college life, uncertain of his own ability to com- 
pete in a more selective academic environment, unsure of his fu- 
ture vocational plans, and so forth. If the data collected in the ad- 
missions process are wisely used, he can be assisted to a better under- 
standing of his problems and adjustments. 

The student's secondary school record will not be new to him. 
What will be new is how his record compares with that of other 
freshmen who have attended different secondary schools and with 
those of older successful students in the college. In these terms, 
is his record good, poor, or mediocre, and what does it mean in 
terms of his probable success if he is admitted? Even if no other 
data were available, this review of the student’s school record 
would be beneficial to him. 

In many instances, however, additional information is available. 
Many colleges require tests as one of their admissions procedures 
—for example, a test like the College Board Scholastic Aptitude 
Test or the School and College Ability Tests, which yield verbal 
and quantitative scores, or a test like the Ohio State University 
Intelligence Test, which yields a total score based largely on verbal 
material. Whatever the test, the college usually will have studied 
the relationship of its scores, and other admissions data, to success 
on its campus. Thus, the admissions staff can help the applicant 
determine his chances of success and, further, consider the pro- 
grams for which he is best fitted, if indeed he is generally pre- 
pared for that college. No final decisions concerning his academic 
program are made at this point, for if he is accepted, he will likely 
review this information, and perhaps additional data, with a fresh- 
man adviser. But in the meantime, he and his parents can be giving 
some thought to his future plans. Perhaps he has entertained 
thoughts of studying engineering without having a realistic pic- 
ture of the required background and skills. If he learns that his 
preparation and ability in mathematics are only minimal for engineering,
and learns more about other options available to him, he
may decide to follow some other program. Or, he may decide to
give engineering a try, but at least he knows in advance that he
may run into trouble.

Then there is the applicant who does not meet the standards
of admission to the four-year degree-granting programs of the college.
He may be guided into a two-year program at this college or
some other institution. Frequently, such a suggestion will require
the student and his parents to reconsider his plans, both in terms of
accepting such a solution and in thinking about vocational areas
for which two-year programs offer preparation. 

Many colleges also include one or 
their admissions requirements. 
on these instruments 
test data should be made easily available to the faculty advisers
and counselors who are guiding freshmen. 

The kinds of tests incorporated in the freshman guidance pro- 
gram should, to a large extent, depend not only on the courses of 
study and the breadth and scope of guidance services the college 
offers, but also on the extent to which tests are used for admissions 
and/or course placement purposes. If the college does not employ 
tests for either purpose, it is obvious that the tests used to help 
freshmen should include a measure of general ability and tests of 
achievement of those skills or basic understandings which are con- 
sidered prerequisite to required college courses. But, if a scholastic 
aptitude test has been used in admissions and if placement is based 
in part upon test performance, considerable information is already 
available, and time which would otherwise be devoted to testing 
may be devoted to the counseling process itself or to the adminis- 
tration of supplementary tests. 

What additional tests the college will find helpful will depend, 
in part, upon the scope and complexities of its curricular offerings 
and policies and the comprehensiveness of its guidance services. 
Many colleges, especially those offering programs oriented toward 
specific vocations, incorporate in their freshman testing programs 
tests of special abilities (music, mechanical, and art aptitude) and 
they administer them, if not to all students, at least to those who 
wish to register in these special programs. Some colleges admin- 
ister measures of interest and study habits. Still others may admin- 
ister a group test intended to measure personal and social traits in 
order to identify those students who show signs of needing special 
help. Students so identified will then be encouraged to meet with 
counselors who are particularly well qualified to provide personal 
counseling. In a majority of situations, however, the information 
which will be of most potential value to the adviser and the student 
will be that which provides information on the student’s academic 
skills and interests. 

An obviously gifted student may be encouraged to take a more 
difficult program. If the college provides opportunities for gifted 
freshmen to by-pass prerequisites, these will be explained to him, 
and he may be encouraged to seek exemption in areas in which he 
appears strong. If the student’s achievement record as well as his
scholastic aptitude scores show he is marginally prepared, the wisdom
of taking a minimum program will be discussed with him. If
the college offers remedial programs on a voluntary basis, he will 
be encouraged to enroll in them if it is probable that he will bene- 
fit from the special work. 

Scholastic aptitude and achievement test data, then, combined 
with high school grades, statements of the student’s interests, and 
records of previous school and work activities, when used with insight
into and understanding of the student that the counselor gains
in the interview, provide a wealth of information to the counselor
in aiding the student.


Sophomores 


Students nearing the end of their sophomore year have presum- 
ably reached the point where they will make a decision about 
their fields of major concentration. Some of them, indeed, came 
to college with their minds already made up, and all of their work 
thus far, barring basic general requirements, has been related to 
their choices. Others have come with relatively clear goals but have 
found the going harder than they expected. Thus, we may find 
hopeful mathematics majors facing almost certain failure if they 
continue mathematics, yet uncertain about the choice of another 
major. Other students find themselves entertaining two or more 
choices which appear equally attractive. Or some students simply 
do not appear to be especially interested in any particular field. 
All except the students who are performing successfully in areas 
related to their proposed major require some help in thinking 
through the decisions they must ultimately make. And almost all 
need help in relating their education 
vocational plans. What data will be | 
counselors? 

Obviously, the student’s academic re 
of college will be an important piece of 
will be scores on the various tests he h 
entrance. In addition, qualitative data about his college activities, 
his personal relationships with faculty and other students, social 
and career aspirations, and so forth, will also enter the picture and 
lend nuances of meaning to the data at hand. 
The student who has a record of failures and barely passing
grades as well as low achievement scores may appear to be a clear- 
cut case to the counselor—he should not continue in a college 
where the enrollment becomes more selective and the competition 
greater each year. Yet the case is far from clear-cut if the student's 
general average is just high enough to permit him to stay and if 
he wishes to stay. He may have just enough determination to make 
the grade if he remains at college, or he may be able to make 
better grades in his major subject, and for this reason alone one 
would be reluctant to advise categorically that he leave. But the 
cards do seem stacked against him, and his situation should be 
made clear to him with all of the tact and understanding that can 
be brought to bear in the situation and, whenever possible, alter- 
nate courses of action suggested. 

The student with an erratic performance record frequently pre- 
sents an easier problem. Often his test scores will corroborate his 
academic record as far as his strong and weak points are concerned. 
They may add little more than this, although at times they may 
indicate that an able student has antipathies to subject, teacher, or 
teaching methods. In any event, scores are often helpful in con- 
firming the student’s choice of major field, since in most cases a 
student with a particular area of competence will tend to choose 
a major field that is compatible with his strength. 

Even though the test record may at first glance seem only to 
substantiate the academic record, it can on occasion be even more 
discriminating, for while the academic record shows the student 
how he stands in relation to other students at his college, the test 
record shows how he stands in comparison with a larger group of 
college students and thus presents him with a better picture of his 
strong and weak points. This can be especially valuable if the stu- 
dent is contemplating a professional choice which may require 
graduate training or further academic competition. 

The student, too, whose record shows that he can succeed at 
almost anything to which he turns can, and often does, confound 
the counselor. If he has already decided to major in history, there 
seems to be little advantage in showing him that he could be 
equally successful in English or American literature; or if he has 
chosen physics, there is no point in indicating that he could do just 
as well in chemistry or mathematics. There may be need though 
to review his vocational plans to see if another, though perhaps 
related, area will better serve his plans. 

The student who appears to have no outstanding interest in any 
one field can be most puzzling. Perhaps he is competent in the 
quantitative or verbal areas, but has no clear preference for a spe- 
cific field within either category and as yet has only the most tenta- 
tive ambitions. Here the counselor will have to muster all the 
information he can about the student’s major interests through a 
review of his activities, likes, and dislikes and through the results 
of an interest inventory in order to help him identify the possible 
courses open to him. Frequently, for such a student, summer or 
other work experience helps to develop his attitudes and to 
sharpen his thinking. 


Juniors 


Unlike entering freshmen, juniors are not adjusting to college; 
unlike sophomores, they are not reaching that crucial point where 
they must choose their major programs, and unlike seniors, they 
are not yet ready to leave their ivory towers. As a result, they are 
usually exempt, by and large, from any institution-wide testing 
programs designed to aid in their selection, placement, or guid- 
ance, or designed to evaluate the extent to which certain institu- 
tional objectives are being achieved. Nevertheless, though we may 
ignore the educational needs of juniors as a class, individual jun- 
iors may need help. A junior occasionally finds that either because 
of academic difficulties or a chan 

Seniors 

marized in the question: What do I do after graduation? For this 
group, then, the problems are more likely to fall into the realm of 
career rather than educational counseling and thus fall outside the 
limits of the discussion here. Nevertheless, it may be useful to re- 
view briefly some of the problems they face and to suggest a few 
ways in which they may be helped even though specially trained 
counselors are not available. 

Most seniors are attempting to make their first job choices. A 
number of them may want considerable help in making this choice 
or may at least want some assurance that their selections are wise 
ones. Even the engineering graduate who sees a number of indus- 
trial recruiting officers needs help in deciding whether he is better 
suited for sales or production engineering, for design or research. 
And if he has settled that question, he may have specific points to 
resolve regarding the relative merits of a large or small company 
and other similar questions. For those whose college training has 
been less oriented toward a particular professional area, the prob- 
lem of a first job choice can be difficult. Personnel officers are not 
clamoring at their doors; they must do the knocking. When and 
how should they do this? This can be an irksome and difficult 
problem for the senior even if he has decided that he prefers in- 
surance to accounting or advertising to banking. 

More difficult are the problems facing the senior who has only a 
vague notion of what he would like to do, or perhaps only clear 
ideas of what he would not like to do. And still more difficult per- 
haps are the problems facing the would-be lawyer or doctor who 
has not been accepted in a professional school and must therefore 
make an alternate career choice, clearly not his first preference. 

All these students need information—information on occupa- 
tional groupings and the training necessary for beginning careers, 
on specific types of jobs within occupational families, and on how 
to write letters of application and how to approach the job inter- 
view. Much of the counseling at this stage will therefore be di- 
rected to showing these students how they may obtain the informa- 
tion they need. A certain amount of it will be aimed at reassur- 
ance and may involve the review of the student’s college record— 
grades, test scores, and activities. Some of it will be directed toward 
his developing greater self-knowledge and helping him to develop 
a problem-solving approach to the choice of a career. Again, this 
may involve a review of all the evidence available and the neces- 
sity, perhaps, of collecting additional data, such as scores on an 
interest measure, to help him correlate what he likes to do with 
what he is able to do and with what the working world requires. 


A VIEW OF SOME SPECIAL TYPES OF 
COUNSELING PROBLEMS 

It may now be helpful to take a somewhat closer look at several 
problems which may appear at any of the academic levels, but 
which are particularly critical as students begin their college work. 


Students with academic deficiencies 


Even when selection is highly competitive, there will be some 
students who in terms of both general ability and academic prepa- 
ration are marginally prepared for a particular college. In colleges 
with lower admissions requirements, there is likely to be an even 
greater number of such students, many of whom face almost cer- 
tain failure unless they receive help, and many others, unfortu- 
nately, who may fail even if they do receive help. It is important 
then for colleges to identify these students early and especially to 
distinguish between those who may profit from special help and 
those who should be guided toward other types of training. 

The weakest students will be those who fall at the low extreme 
of their classes according to all the data available—test scores and 
academic records. For some of these students, remedial work in 
such tool subjects as reading and mathematics may be beneficial, 
since lack of skill in these areas may have been the cause of lack 
of success in content subjects and of poor performance on aptitude 
tests. Unfortunately, however, this will not always be the case. Stu- 
dents performing at a relatively low level in the basic skills may 
actually be performing at a level commensurate with their gen- 
eral ability. In such cases, remedial programs will not usually 
bring about any significant improvement in basic skills. However, 
since it is rarely clear which came first, the low ability or the poor 
skills, it is often desirable to give such students an opportunity 
to obtain special help. At the same time, they must be helped to 
realize that they may face academic difficulties and that, perhaps, 
vocational training will be more compatible with their abilities. 

In other cases, there is considerable likelihood that students will 
benefit from remedial work if they apply themselves seriously. 
Such students will demonstrate a weakness in a basic skill area 
while performing at a relatively higher level on a measure of 
scholastic aptitude. Students who show this pattern in their test 
and academic performance often have the necessary general ability, 
but have failed to master some of the basic skills for one reason 
or another. If helped early in their college careers, they can often 
compensate for their earlier lack of proficiency and do creditable 
college work. 

Other problems of academic deficiency will occur as students 
progress through college. Some of these will be serious enough to 
jeopardize their chances of graduation, and efforts should be made 
to understand the possible causes. Is the student working at his 
optimum level? If this is the case, should he attempt to continue 
or should he be guided toward considering alternative courses of 
action? If his academic and test records indicate that he should be 
able to perform at a satisfactory level, what factors may be contrib- 
uting to his unsatisfactory performance in one or more courses? 
Is he in a field which holds little of interest to him? Does he have 
poor study habits? Is he preoccupied with personal problems that 
vitiate his efforts to concentrate? Is he too busy with campus activi- 
ties, work, community, or social activities to organize his time effi- 
ciently? Is he physically well? Tests, of course, will not supply the 
answers to all of these questions, but they will help the counselor 
and the student reach the conclusion that he is not doing as well as 
he is capable of doing. With this established, one usually needs to 
look elsewhere to establish the real causes of difficulty. 


Gifted students 

The need to counsel gifted students is likely to appear less ur- 
gent on most campuses than the need to counsel the academically 
weak. Unless the gifted student is so little challenged that he is 
unable to channel his abilities effectively or has personal problems 
which interfere with his ability to concentrate, the gifted student is 
likely to do acceptable, although not always outstanding, work. He 
does not, therefore, tend to come to the attention of the faculty 
or administration as one who needs help. But because of his po- 
tential as a prospective high-level professional person or technician, 
he should have all of the help possible in planning a challenging 
program of study. 

There are two aspects to this problem. One is that of identifica- 
tion. Who are the gifted students, and how do we know whether 
or not they are performing below their capacities? Test data and the 
other kinds of evidence we have discussed will be helpful here. 
High test scores and only average or slightly above average aca- 
demic grades will indicate that a particular student needs encour- 
agement, assistance, or a forthright “talking to.” High test scores 
and a good academic record may indicate that a student is working 
near his capacity; on the other hand, it may not. We have no way 
of knowing whether he could do better until he has attempted 
more difficult work. 

This brings us to the second aspect of the problem. What can 
be done to provide the gifted student with greater challenge? Here 
it will be largely a matter of what a particular college can provide 
in the way of added intellectual stimulation for the very able. One 
question concerns the number of such students on the campus. 
Does a very bright student have any peers, or does he stand alone? 
What training and preparation do individual faculty members 
have that can contribute to education of the gifted? Are there 
instructors in each department who are themselves strong enough 
to offer the challenge these students need? Are the local library 
research facilities adequate for stimulating independent study and 
research? And so, question can be piled 


mining whether a particular campus is able or willing, in terms 
of its traditions and purposes, to provide the kind of intellectual 
environment which will meet the needs of the most gifted. Assum- 
of, or in addition to, personal problems—academic problems 
which stem from a cultural impoverishment of their homes and 
environments. They may lack the breadth of intellectual associa- 
tions to make much of their college work meaningful and enjoy- 
able. They may be openly hostile to certain kinds of ideas in 
course work or simply indifferent to the best efforts of some of 
their professors. Others may be sincerely interested in making up 
their deficiencies, but find the going difficult. Both types of stu- 
dents present educational counseling problems and will need help 
in scheduling their course work so that they will experience suc- 
cess and have their intellectual curiosities aroused. 

In a category by themselves, perhaps, will be students from other 
countries who, in addition to having come from different cultural 
backgrounds, will have come from school situations which differ 
markedly from those in the United States. The content and se- 
quence of course work may have been quite different. Some will 
be better prepared in certain subjects than many of our stu- 
dents; in other areas, they will be less well prepared. How can 
their academic preparation be evaluated so that they may be 
placed in courses which offer both success and challenge? 

For these students, as well as for some of the subgroups within 
our own culture, tests are likely to be less helpful than they are 
with the average white middle-class college student. We can be 
sure in most cases that these students are at least as good as their 
test scores indicate, and in many cases there is every likelihood 
that they are better. But how much better is a question which 
most colleges are unable to answer either for groups or for indi- 
viduals, simply because they may not have sufficient data on such 
groups of students to know what kinds of school records and test 
scores best separate the academic successes from the academic fail- 
ures. Only a history, compiled over a long period of time, will help 
to build up information which can be used to predict the likeli- 
hood of success for these students. 

Many otherwise bright and well-educated foreign students will 
have initial difficulties with their work in American colleges be- 
cause of insufficient knowledge of English or even unfamiliarity 
with classroom teacher-student relationships. Whatever the prob- 
lems, they need special help in making satisfactory social adjust- 
ments to the mores of American college students while at the 
same time retaining their own national individuality so that they 
do not become alienated in thought and custom from their own 
people. 

There are, of course, other categories of students who need spe- 
cial counseling if they are to make satisfactory adjustments in 
college and in later life. They may be students with severe physical 
handicaps who need special educational and vocational help as 
well as personal assistance if they are to learn to lead lives that are 
optimally useful and happy. Others are students with emotional 
disturbances. Most of these problems, however, fall within the 
realm of personal counseling. 


UTILIZING TEST AND OTHER DATA IN 
EDUCATIONAL COUNSELING 

In the preceding sections mention has been made of the role 
that scholastic aptitude and achievement data can play in dealing 
with the various types of educational and vocational problems 
that college students face. The full value of the data used in the 
counseling process, however, will be realized only if they are in- 
terpreted within frames of reference which have meaning for the 
problems being studied. This is true for all the data being used, 
but presents special problems when test scores are being employed. 


The use of published norms 

in the norms population. Also, in a few situations, where a college 
has not yet been able to establish its own norms, published norms 
may serve as a point of departure in arriving at decisions that 
would be difficult without some comparative information. 


The value of local norms 

By and large, the use of tests in educational counseling is a mat- 
ter of prediction, and prediction involves criteria of success in a 
well-defined situation. Thus, at the college level when a student 
is planning his educational program, he must do so in terms of 
the demands of a particular college and in terms of local competi- 
tion. For problems of this type, local norms will be of primary 
importance. It does not matter whether the student stands at the 
65th percentile on the published norms on a scholastic aptitude 
test if on local norms he stands at the 30th percentile. Regardless 
of whether he is about average or slightly above average in general 
ability as far as college students in general are concerned, on this 
campus he is in the lower third and in all likelihood will find the 
going difficult unless he develops excellent study habits and applies 
himself diligently. So also in achievement areas he is better able 
to see how his academic preparation compares with that of other 
students. 
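
The contrast between a student's standing on published norms and on local norms can be illustrated with a brief computation. What follows is only a minimal sketch: the function name, the sample scores, and the tie-handling convention (counting half of the tied scores as falling below) are assumptions made for illustration and are not drawn from any particular test manual.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting half of any ties."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    tied = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(norm_scores)


# Hypothetical local norm group: aptitude scores of ten entering freshmen.
local_scores = [48, 52, 55, 57, 60, 62, 64, 67, 70, 75]

# A student's raw score is located within the local group only;
# his standing on published norms would be read from the publisher's table.
student_score = 55
local_rank = percentile_rank(student_score, local_scores)
print(f"Local percentile rank: {local_rank:.0f}")
```

In practice a norm group of ten would be far too small to be dependable; the same function applies unchanged to the accumulated scores of several entering classes.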

To be of most value, however, local norms must be carefully com- 
piled. In small colleges this may often mean waiting two or three 
years, sometimes longer, so that a sufficient number of students 
will be represented in the norms to make them dependable. There- 
after, they should be checked not only when there are good reasons 
to expect marked changes in the characteristics of the student body 
but at periodic intervals, since gradual and unnoticed changes in 
the enrollment may be taking place.¹ 

A college can use local data in educational counseling in an- 
other important way. Because the norms, usually percentiles, for 
all the tests are based on the same student population, comparisons 
across tests can be made. While publisher’s norms can sometimes 
be used for comparison purposes, they offer the serious limitation 
that it must be assumed that the norms are based on populations 
that are very similar to, if not the same as, the college's student 
population. Use of local norms obviates any assumption concern- 
ing comparability. 

1 Methods for developing local tables of percentile ranks are presented in almost 
any standard statistics text. 


THE NATURE AND SUCCESS OF COUNSELING SERVICES 

The nature of the counseling services in any institution reflects 
the educational and organizational philosophy of the institution’s 
guiding personalities—their concept of what counseling can and 
should do; their estimate of the support for counseling that is ad- 
ministratively defensible in terms of money, staff, and facilities; 
the ability level of the student body, the level of scholastic and 
personal preparation for college, and other pertinent character- 
istics. 

Effective counseling services, in which the roles of administra- 
tive counselors, faculty counselors, student personnel counselors, 
and specialized testing and guidance counselors are well coordi- 
nated, require careful planning, careful staffing, and continued 
adaptation to the evolving needs of the student body. These needs 
change as the proportion of veterans changes; as the proportion 
of students with college-educated parents changes; as the ability 
level and level of preparation of the students change; as the eco- 
nomic prosperity of the students, of their home communities, and 
of the college community changes; as the ratio of male to female 
students changes; as the tendency of students to marry during col- 
lege years changes; and so forth. 

The success of counseling services depends primarily upon the 
human and professional qualities of the staff and their willingness 
to put their hearts as well as their minds to the task of doing what 
is best for the long-range development of e 
Sensitive to administrative necessities. Stu 
recognize sympathetic and understandin 
*Some batteries of published tests provide for comparability. If the college is 
using such a battery, it need not develop local norms before plotting student 
scores on profile charts. In many instances, however, it will be found helpful to 
substitute local norms for published data, or to superimpose local norms on the 
published data. 


6. The Use of Tests in General 
Institutional Evaluation 


IN THE LAST SEVERAL CHAPTERS THOSE APPLICATIONS OF TESTING 
related chiefly to the day-to-day operation of the educational insti- 
tution have been considered. For the most part, they have con- 
cerned provisions for the welfare of students—the initial screening 
to determine their fitness for particular programs, their guidance 
after admission, the improvement of their courses, and the evalu- 
ation of their progress in courses and of their interest in general. 
In addition, test results can supply policy-making groups of an 
institution with some of the evidence needed to reach decisions for 
the collective good. Scores can also contribute information when 
an institution wishes to take an over-all view of its educational 
enterprise in order to identify procedures that seem to be accom- 
plishing the purpose for which they were created and procedures 
that are in need of modification. 

On those occasions when tests are selected or constructed for 
measuring various instructional objectives, they often become the 
focal point of a dynamic process of exchange of opinion and thus 
serve to get issues described, reviewed, and clarified for the bene- 
fit of all concerned. 

The present discussion considers the design of the more com- 
mon institutional self-surveys that employ tests as one means for 
evaluating attainment of major educational goals and needs. A 
comprehensive institutional evaluation will necessarily be long- 
range in plan and will continue over several years, since even the 
first step necessitates a complete description by the instructional 
staff of the outcomes expected. The preparation of the descrip- 

2 For a list of the kinds of questions faced at the outset in an evaluation of gen- 
eral education, for example, see Paul L. Dressel, “Evaluation Procedures for General 
Education Objectives,” Educational Record, 31: 97-122, April 1950; or see Paul L. 
Dressel and Lewis Mayhew, General Education: Explorations in Evaluation (Wash- 
ington: American Council on Education, 1954), p. 21. 

tions takes time, and even more time is required for the actual 
outcomes to be demonstrated by the instructed. The analysis be- 
gins with the formulation of a complete blueprint of the major 
outcomes that are expected from all the units of the program. 
From among the outcomes must be selected those which reflect the 
chief concerns of the institution. Sometimes the selection may be 
of a particular unit of instruction and therefore the evaluation 
limited to a narrow sector of the institution’s program. This would 
be the case if it were decided, for example, to examine the institu- 
tion’s offerings in modern languages. On the other hand, some 
objective of the institution which is much more pervasive in its 
effects might be selected; under these circumstances, the evalua- 
tion endeavor can become quite complex. 

The most logical and defensible approach to the selection of 
objectives for evaluation is that which emphasizes the attainment 
of outcomes considered to be most important in every unit of the 
educational process, or perhaps most important to all units collec- 
tively; however, circumstances of expediency (time, budget, per- 
sonnel, available instruments, and the like) usually dictate other 
criteria of selection. Selection of objectives to be appraised at a 
particular time may then be related (a) to some one unit, say, to 
the area of humanities, with plans to consider other divisions be- 
ing held in abeyance; or (b) to a particular level, say the sopho- 
more class, since the dropout rate at that level indicates that the 
two-year product is representative of the institution and it may 
be desirable to know what he is like; or (c) to a sophomore class, 
to determine whether the background and skills required for con- 
tinuation are being mastered; o 
area, to demonstrate that it is a weak link in achievin 
after integration; or (e) to areas in w 
lished. (If the institution does have res 
and, thus, greater flexibility in devising instruments to measure 
outcomes, it is best to confine the test construction to measures 
that supplement, rather than duplicate, published tests.) 

While the above list is obviously incomplete, it does list some 
of the choices an institution may make instead of attempting to 
measure many objectives with less effectiveness. At the same time, 
an over-all blueprint and long-range plan will serve to place the 
selected portions in proper perspective so that they will not re- 
ceive undue emphasis merely because they are temporarily identi- 
fied for study. 


THE APPRAISAL OF GENERAL VERSUS SPECIALIZED 
OUTCOMES OF EDUCATION 


Most American colleges offer a two-pronged program, endeavor- 
ing, on the one hand, to impart general knowledge and skills to 
students through traditional liberal arts courses and/or programs 
of general studies and, on the other, to develop proficiency in a 
more specialized field of knowledge. Although there are differing 
patterns for accomplishing these objectives, it is now common to 
stress the first purpose in the first two years of instruction and the 
second in the last two years. Regardless of the pattern of courses 
and programs adopted, the use of some measure of degree of at- 
tainment offers a starting point for the evaluation of the college’s 
educational process. 

When the general goals and purposes have been defined in terms 
sufficiently concrete to be identifiable for evaluation, it is then 
possible to determine what evidence will reveal progress toward 
attainment. For evidence of progress toward some goals, records 
may provide the most appropriate information; for others, special 
observations by the faculty may be required; and so forth. It is in 
the measurement of academic accomplishments that well-con- 
ceived and well-constructed tests may be of special assistance. 
Where course objectives and subject matter are unique, only a 
locally prepared test can reflect the content and emphasis and 
constitute a suitable appraisal device. If the purpose is to measure 
certain broad outcomes common to many institutions, a published 
test may be quite suitable and, all factors considered, certainly 
less costly. 

The timing of measurement constitutes another matter for de- 
are expected to attain the objectives of the gen- 
ng the first two years of college, meas- 
ade at the end of that period. Often, 
or year also will be valuable in 
determining whether the outcomes in general education have been 
retained or in determining the effect of upper-class instruction on 
further attainment in this area. 

Any tests used to measure outcomes in fields of major concen- 
tration will be more specific than those used to appraise general 
education outcomes. Nevertheless, they still should be focused on 
those aspects of specialization which are regarded as having per- 
manent value. Here, again, a decision must be made whether to 
use locally prepared tests or published tests. While published tests 
seldom incorporate all the subject-matter and other instructional 
objectives of any single college department and may incorporate 
content that is not taught, they do contain much that is common 
to many programs. Some colleges, then, may well sacrifice the 
measurement of some of the outcomes they consider important in 
order to obtain data which can be utilized for certain types of 
comparisons, as, for example, in respect to those who will enter 
certain graduate schools, the comparison of the achievement of 
their own majors in chemistry with that of students trained else- 
where. 


GROWTH STUDIES VERSUS STUDIES OF STATUS 

It has already been suggested that at certain points in the stu- 
dents’ careers information is needed concerning their development 
in reference to specified educational objectives. This information 
may be useful in several different ways. Primarily, it will show the 


onstrate how far students have progressed toward basic goals since 
first entering college. This question can be answered only if meas- 
urements of student performance are made prior to instruction 
and at appropriate intervals thereafter: at least two, and some- 
times several, measurements will be needed if the college is to 
assess the development associated with its program: one, as stu- 
dents enter college; one, midway in their careers; one, at the 
terminal point of their formal education; and one, sometime after 
and selecting appropriate instruments. Second, an appraisal of 
outcomes of general education should focus on the knowledge, 
skills, and understanding which are the outcomes of required gen- 
eral education courses, or outcomes which can reasonably be ex- 
pected from a college education, regardless of the specific courses 
or sequence of courses taken. Excluded are special outcomes, how- 
ever desirable they may be, which may logically be associated only 
with certain kinds of studies not required of all students. Third, 
the instruments constructed or selected must produce dependable 
and pertinent information on the attainment of these outcomes. 
Fourth, to obtain comparable data, identical tests or their alternate 
forms should be used at each point in the process. 

In addition to these general precautions, there are several more 
which are specific to the design of longitudinal studies. The study 
should include only those cases for whom complete measurement 
data are available since students who withdraw are likely to be less 
able than those who stay, and, unless they are eliminated from the 
group study, spurious gains will appear. For this reason, scores 
for freshmen who do not complete the sophomore year should be 
eliminated from the freshman data before freshman average scores 
are computed. Eliminating those for whom two sets of measure- 
ments are not available guarantees that the comparison is of the 
same students. Similar precaution should be observed for any later 
testings. 
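
The precaution just described can be made concrete with a small sketch. The student identifiers and scores below are invented for illustration, and the function name is likewise an assumption; the point is only to show how averages are restricted to students tested on both occasions.

```python
from statistics import mean


def matched_means(first_testing, second_testing):
    """Average each testing over only those students present at both testings."""
    common = sorted(set(first_testing) & set(second_testing))
    if not common:
        raise ValueError("no student was tested on both occasions")
    first_avg = mean(first_testing[s] for s in common)
    second_avg = mean(second_testing[s] for s in common)
    return first_avg, second_avg, len(common)


# Hypothetical scores keyed by student; A and B withdrew before the second testing.
freshman = {"A": 40, "B": 45, "C": 50, "D": 60, "E": 65}
sophomore = {"C": 54, "D": 63, "E": 69}

# Unmatched comparison: the freshman average includes students who later withdrew.
unmatched_gain = mean(sophomore.values()) - mean(freshman.values())

# Matched comparison: the same three students at both points.
first_avg, second_avg, n = matched_means(freshman, sophomore)
matched_gain = second_avg - first_avg

print(f"Unmatched gain: {unmatched_gain:.1f}")
print(f"Matched gain ({n} students): {matched_gain:.1f}")
```

With these invented figures the unmatched comparison overstates the gain, because the two lowest-scoring freshmen are absent from the sophomore data.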

The analysis of results in the longitudinal study will require at 
the very minimum the computation of averages on each measure 
used for the freshman, sophomore, and senior groups and of the 
spread of the scores on each of the tests used. In addition it should 
be established that any differences that occur between these aver- 
ages are not merely chance differences attributable to the unrelia- 
bility of the measure used or to the smallness of the group or its 
lack of representativeness. Therefore, it is highly desirable that a 
statistic known as the “significance of the difference” be computed 
for each difference being studied.² 


2 Statistics texts should be consulted both regarding the computation of this 
statistic and its interpretation, since there are a number of assumptions involved. 
For example, see chap. iii, “Small Sample Error Theory,” in E. F. Lindquist, 
Statistical Analysis in Educational Research (Boston: Houghton Mifflin Co., 1940). 

It should be borne in mind that often the error is made of
establishing differences and the lack of differences too glibly on the 
basis of this statistic alone. It is a final step, to be taken only after 
the investigator is certain that other conditions of the research 
design have been fulfilled: all aspects of the change should have 
been considered, the measures used in evaluating the change 
should have been appropriate for each aspect, and such other 
factors as bias in the sampling, equivalence of learning oppor- 
tunity, testing conditions, and the like should have been con- 
trolled. Sometimes a difference which is described as “significant” 
in the statistical sense is so merely because some one factor has 
not been controlled; the explanation for the difference, then, does 
not lie in any intrinsic aspect of the instructional program but in 
some peripheral condition and should not then be presented as a 
difference which is significant in an educational sense. 
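
As a rough illustration of the statistic under discussion, the sketch below computes a t ratio for the difference between two testings of the same students. The choice of a paired (correlated-means) t ratio is an assumption made here for the longitudinal case, not a procedure prescribed by the text, and the scores are hypothetical. The resulting ratio would be referred to a t table with the stated degrees of freedom, and only after the design conditions described above have been met.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (mean difference, t ratio, degrees of freedom) for two
    testings of the same students, listed in the same order."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two students")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Standard error of the mean difference (undefined if all gains are identical).
    std_error = stdev(diffs) / sqrt(n)
    return mean_diff, mean_diff / std_error, n - 1

# Hypothetical freshman and sophomore scores for the same eight students.
freshman_scores = [52, 48, 60, 55, 47, 63, 58, 50]
sophomore_scores = [57, 50, 66, 58, 49, 70, 61, 56]

mean_diff, t_ratio, df = paired_t(freshman_scores, sophomore_scores)
print(mean_diff, round(t_ratio, 2), df)
```

A large ratio says only that chance is an unlikely explanation; it does not show that the instructional program produced the gain.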

Assuming, however, that mean score differences are obtained
that cannot be accounted for by chance or by extraneous bias, are
they educationally significant? Not necessarily. Judgments of the
educational importance of such improvements can be made only 
by the faculties concerned after a careful weighing of the test re- 
sults against their teaching objectives. In some instances, small 
mean gains from one year to the next may be quite defensible. 
If, for example, small mean gains occur in areas where there is an 
interest only in a minimum level of proficiency, there may be no 
cause for concern. Similarly, if entering freshmen perform at a 
substandard level of achievement in some one area, there is no 
need to be unduly disturbed if achievement in that area in the 
sophomore year seems less than that in other areas where the 
students were more advanced initially.


There are instances when a college should be disturbed if stu- 
dents do not demonstra 


methods, and other factors affecting student progress.

Thus far, the position has been taken that information on student
development can best be obtained by first establishing a reference
point at the freshman level and then obtaining second and/or
third measurements at intervals thereafter. This is the ideal ap- 
proach, but it has the disadvantage of requiring time for results 
to materialize. But often delay is impractical, say, when reorganiza- 
tion of a sequence of courses is being considered and immediate 
information on present outcomes is required before changes are 
initiated. 

If this is the case, another approach may be useful, although it 
does have shortcomings in regard to measurement. This second 
approach assumes that the general caliber of students and their 
educational experiences in a given college and in a given span of 
time will not vary significantly—in other words, this year’s sopho- 
mores are similar to what the present freshman students will be 
after they have completed two years, and, similarly, the present 
sophomores two years hence will be comparable to the current 
senior class. On this assumption, suitable measures are admin- 
istered within the span of the same year to entering freshmen, end- 
of-year sophomores, and end-of-year seniors, and thus comparable 
data are obtained for study. This is known as the “horizontal” ap- 
proach. 

In general, data from a horizontal study should be treated in
much the same way as in the longitudinal study, and mean scores
should be computed and the statistical significance of mean differences
tested. The analysis is the same as for the long-term study,
with this major difference: in addition to reviewing all the usual
factors which might account for differences, consideration must be
given to the possibility of bias due to attrition and to the possibility
that some degree of the difference may be attributable to unequal
initial achievement levels of each class. In part, these risks can be
minimized by matching the compared groups according to scores
on scholastic aptitude and achievement tests which they took as
freshmen. This precaution will not, however, control other factors
that might affect one class but not another, such as changes in administration,
faculty, and course requirements. When an important
change of this kind has taken place, one of the basic assumptions
for a horizontal study cannot be met. It is then ill-advised to use
the horizontal approach since limitations of the data and subsequent
difficulties of interpretation may prove insurmountable.
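
The matching of compared groups mentioned above might be sketched as follows. The pairing rule (nearest freshman aptitude score within a fixed tolerance, each student used once) and all the scores are assumptions for illustration; other matching rules are equally defensible.

```python
from statistics import mean

def match_groups(group_a, group_b, tolerance=2):
    """Pair members of group_a with unused members of group_b whose
    freshman aptitude scores differ by at most `tolerance`.
    Each member is a tuple: (aptitude_as_freshman, current_achievement)."""
    remaining = sorted(group_b)
    pairs = []
    for a in sorted(group_a):
        best = None
        for idx, b in enumerate(remaining):
            gap = abs(a[0] - b[0])
            if gap <= tolerance and (best is None or gap < abs(a[0] - remaining[best][0])):
                best = idx
        if best is not None:
            pairs.append((a, remaining.pop(best)))
    return pairs

# Hypothetical data: (aptitude score as freshman, achievement score this year).
freshmen = [(100, 40), (110, 45), (120, 50), (90, 35)]
sophomores = [(101, 52), (119, 60), (135, 70)]

pairs = match_groups(freshmen, sophomores)
freshman_mean = mean(f[1] for f, s in pairs)
sophomore_mean = mean(s[1] for f, s in pairs)
matched_difference = sophomore_mean - freshman_mean
print(len(pairs), matched_difference)
```

Unmatched students are simply dropped from the comparison, so the matched groups may be much smaller than the classes from which they were drawn.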

There are a number of situations in which a college may find it 
practical to take a dual approach to evaluating student develop- 
ment. If there is an immediate need for certain kinds of evidence, 
the college may begin by collecting data on a horizontal basis: all 
the currently enrolled students at academic levels of concern are 
tested within the same academic year, the data studied, and con- 
clusions drawn. Thereafter provision is made to collect data on a 
longitudinal basis, with the freshmen being retested at the end of 
the sophomore year and again at the end of their senior year. 

The obvious advantage to this compromise approach is that the 
college obtains data immediately from which to make tentative 
hypotheses about student development and, yet, testing the same 
individuals as they progress through the program assures more 
dependable data for later assessment of growth. 


YEAR-TO-YEAR EVALUATION OF CLASSES AT THE 
SAME ACADEMIC LEVEL 
The two types of studies just discussed are primarily intended 
to show how much students develop in relation to certain objec- 
tives as a result of the program of studies. There are, however, 
other ways of looking at the test data collected from such studies— 
or for that matter from an 


the college conducts—which are useful in providing other kinds of 
information frequently 
is the year-to 
lar academic 


man class through college, 

Comparisons of this kind help a college assess the effect of any
specific change in its program. For example, when the humanities 
requirements are reorganized to provide for greater integration 
of its separate disciplines, are additional outcomes achieved? If 
other objectives are lost, is such loss justified in view of the new 
outcomes? Similarly, does the introduction of a new general education
requirement in science create greater proficiency in an area
of learning in which previous classes have been weak? If not, is this
because the test does not measure the outcomes for which the college
strives, or does the course need further review and reorganization?
Similar questions can be raised concerning other changes the
college has made in recent years. 

Comparisons of student achievement from one year to the next 
can be useful, even if innovations in the program have not been 
made recently. For instance, for one reason or another, a college 
may be largely perpetuating the basic elements of its program. 
Experimentation, if any, may be left to the initiative of individual 
instructors. Over a period of years the college has, perhaps, been 
generally satisfied with the types of students it attracts and with 
their over-all level of performance. There have been only minor 
variations among classes from year to year. Suddenly a change oc- 
curs in one direction or another. There may be a drop in student 
performance on the sophomore measures although there is every 
indication that the class is of the general caliber of previous classes. 
Errors in administering and scoring the tests or in computing aver- 
age scores are always possible and, consequently, these operations 
should be re-examined. But, apart from errors, what factors might 
account for the change? Has there been a general easing-up of 
academic standards? Have students become more involved in cer- 
tain kinds of extracurricular activities than in the past? Or has 
there been a somewhat unconscious change in emphasis in the 
instructional objectives so that the tests used no longer reflect the 
content and goals of certain basic courses? In the latter case, again 
it must be asked, are these changes for the better or worse; that 
is, should the changes in the instructional program be maintained 
and new tests built or selected, or should the tests be kept and the 
instructional program brought more in line with what the college 
considers to be its educational objectives? 

Again, a college may find that, although the general caliber of
incoming students is gradually improving, student accomplish- 
ments at the end of the sophomore or senior year maintain the 
same level as those of previous classes. Since it is reasonable to ex- 
pect that the newer and more able students should perform at a 
higher level, there would be cause to consider why they are not 
doing so. Are the measures in use of sufficiently high “ceiling” to 
reflect a higher level of achievement? Are the students already so 
well prepared in certain areas that they are content to coast along 
on their previous background? Are there shortcomings in the in- 
structional program in stimulating these more able groups of stu- 
dents; if so, can these be identified in any way? 

Certain factors must be controlled before the answers to these 
questions can be obtained. Among these, the most important is to 
ascertain that the same tests or their alternate forms have been 
used. If in the interim there has been any change in the test used, 
comparisons cannot be drawn unless provision is made to equate 
the scores of the two tests. If this is not done, it cannot be said, 
for example, that a score equal to the published mean on Test X 
represents the same achievement as a score equal to the published 
mean on Test Y, for the subject matter of the tests, the difficulty 
levels of the questions, and the quality of the students used in 
standardization may all be different. Unless tables of equivalent 
scores have been developed, a college that has changed one or 
more tests in its program must accept the fact that the old and new 
data are not comparable and that, until there has been an oppor- 
tunity to collect data on the new instruments, the kinds of com- 
parisons discussed above are not justified. 
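
Where no published table of equivalent scores exists, one simple way of equating is a linear conversion based on the means and standard deviations of the two tests. The sketch below assumes, hypothetically, that a single group of students has taken both Test X and Test Y; it is a stand-in for, not a reproduction of, any publisher's equivalence table, and the scores are invented.

```python
from statistics import mean, pstdev

def linear_equating(x_scores, y_scores):
    """Build a function converting a Test X score to the Test Y scale,
    using the means and standard deviations from one group that took both."""
    mx, sx = mean(x_scores), pstdev(x_scores)
    my, sy = mean(y_scores), pstdev(y_scores)
    if sx == 0:
        raise ValueError("Test X scores show no spread; cannot equate")

    def to_y_scale(x):
        # Same standing relative to the group on both tests.
        return my + (sy / sx) * (x - mx)

    return to_y_scale

# Hypothetical scores of the same three students on both tests.
test_x = [10, 20, 30]
test_y = [40, 60, 80]

convert = linear_equating(test_x, test_y)
print(convert(20), convert(30))
```

Such a conversion is trustworthy only to the extent that the two tests measure much the same content and the equating group resembles the classes being compared.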

Over and above the consideration of comparable data is the 
need to determine the statistical and educational significance of 
score differences observed from year to year. Just as with the longi- 
tudinal study, it is necessary to determine both whether the differ- 
ences are greater than any that might be attributed to chance alone 
and how large a difference, either positive or negative, is educationally
significant. While a simple statistical manipulation can
determine the first type of significance, there are no formulae for
determining educational significance: the importance of obtained
differences can be assessed only when reviewed in terms of all of
the local educational factors.


COMPARATIVE STUDIES AMONG DEPARTMENTS 


The kinds of comparative studies suggested thus far have been 
related to determining the effectiveness of such general institu- 
tional provisions as might stem from particular admissions criteria, 
modifications in program requirements, faculty up-grading, new 
facilities, and the like. The total educational program is, however, 
made up of parts, and the effectiveness of these individual components
must also be studied. For example, it may be advisable to
appraise the caliber of departmental instruction. While such studies
are obviously hazardous because of emotional overtones, and
undoubtedly there are some members of the college community
who believe that they should not be undertaken at all, nevertheless,
they are part of any evaluation of the total institution. In such
instances, it is much better to have a factual basis for comparisons
than to rely on the subjective judgments either of the administrative
staff or of the various departmental faculties.

One approach to departmental comparisons is through re-use of 
results from a college-wide testing program that was primarily in- 
tended to evaluate achievement in general education. The stu- 
dents might be classified by department or by field of study and 
the test results of each group studied to obtain leads concerning 
areas in which instruction appears to be strong or weak. If ob- 
tained differences for these student groups have statistical signifi- 
cance and seem to be large enough to have educational importance 
also, certain other factors should be weighed before the conclusion 
is drawn that poor instruction alone explains underachievement 
or, vice versa, that some one department is doing an unusually 
fine job. Perhaps students of one department are required to take 
but a single course in the area while in other areas a sequence of 
several courses is required. Possibly, facilities vary greatly by de- 
partments, or the faculty-student ratios differ. The quality of 
groups attracted to the area in the first place is a most important 
factor. And conceivably the portion of the test used to measure 
outcomes in a certain area may be less appropriate to the institution’s
objectives in that area than is true of the other fields measured
by the test. These and other questions must be answered before
final judgments are made, but concomitantly the very process
of raising and answering the questions is beneficial to all con- 
cerned and frequently will promote interdepartmental tolerance 
and understanding. 

Comparisons of performance in major fields of concentration 
also are difficult to make. An obvious reason is that tests designed 
to measure the specialized outcomes of different fields are not di- 
rectly comparable. How can the performance of a history major on 
a history test be compared with that of a physics major on a physics 
test? Each has a different background, and the tests have different 
content. It is like trying to decide whether one person is a better 
lawyer than another is a doctor. Since no absolute answer is avail- 
able, it can be said only that among lawyers this person stands 
high and among surgeons the other person is not among the top-rankers,
a comparison of sorts but not one which justifies an unequivocal
conclusion.

Standardization data offer another means of evaluating depart- 
mental achievement. If the tests used in the various departments 
have all been standardized simultaneously on the basis of results 
obtained through their administration to students in the same 
groups of colleges, this standardization group can be used as a 
reference point. Thus, if the average score of history majors in
College A on the history test is higher than the score achieved by
60 percent of the history majors in the standardization group,
while the average score of the physics majors on the physics test is
higher than the score made by only 20 percent of the physics majors
in the standardization group, it can be said, and should be
useful to know, that the relative standing of College A’s history
majors is higher than the relative standing of its physics majors.
However, before concluding that this is the result of better instruction
in history than in physics, some of the same kinds of questions
raised above in regard to quality of group, facilities, student-teacher
ratio, and the suitability of the tests should be raised. The
first and most obvious point to consider is whether the history test
more closely reflects the content and emphasis of the major program
in history than the physics test does in relation to the program
in physics. Presumably, the question about the test should
have been raised before it was administered, but it is possible that
the physics test actually is less suitable than it was originally
judged to be.
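
The comparison of relative standing can be made concrete with a short sketch. The standardization scores and college averages below are hypothetical and were chosen only so that the result reproduces the 60 percent and 20 percent figures of the example above; "standing" is taken here to mean the percentage of the standardization group scoring below the college average.

```python
def percent_below(score, reference_scores):
    """Percentage of the reference group scoring strictly below `score`."""
    below = sum(1 for r in reference_scores if r < score)
    return 100.0 * below / len(reference_scores)

# Hypothetical standardization-group scores for majors on each test.
history_norms = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
physics_norms = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# Hypothetical average scores of College A's majors.
college_history_mean = 68
college_physics_mean = 48

history_standing = percent_below(college_history_mean, history_norms)
physics_standing = percent_below(college_physics_mean, physics_norms)
print(history_standing, physics_standing)
```

The two percentages are comparable only because both tests are assumed to have been standardized on the same group of colleges; they say nothing by themselves about why the standings differ.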


USE OF NORMS IN COLLEGE COMPARISONS 


The increase in student achievement on general education out- 
comes from the freshman to senior year, or the trend in achieve- 
ment over a period of years, can be determined from local data 
alone without recourse to normative data. Thus, a college cer- 
tainly need not use published tests for studies of this type if it 
believes that locally prepared tests more closely represent its cur- 
riculum. However, it may be desired to evaluate the effectiveness 
of an institution’s efforts through comparison of its students’ 
achievement with that of students in other colleges.

In the abstract, such studies have considerable merit because 
they offer colleges the opportunity to compare themselves with 
an outside standard of success. Theoretically, this should prevent 
a college from becoming too inbred in its efforts—a possibility if 
performance is reviewed solely in terms of local standards. In prac- 
tice, however, comparative studies are not quite so valuable, for 
at least two reasons. First, no absolute standard of success exists for 
all institutions, and, second, the very nature of the preparation of 
published tests presents measurement difficulties. 

In building tests for wide distribution, every effort is made to 
design tests in terms of the kinds of objectives and materials typi- 
cally included in the greatest number of college courses, but it 
must be recognized that the tests may not measure all the objec- 
tives that any individual college promotes nor reflect the same 
balance of offerings. It follows then that, although a test may repre- 
sent a college program in a number of important ways, it rarely 
fits perfectly; if it includes a number of areas there will be compli- 
cations of inequalities of appropriateness from subject to subject, 
with, say, the biology section corresponding well, the mathematics 
test less well, and so on. The extent to which the study can tolerate
some inappropriateness and still produce useful results must
be decided by each institution.

The second problem mentioned above relates to the norms
group upon which tests are standardized. Comparison with a larger
norms group is necessary if a local college is to make nonprovincial 
judgments. But for these comparisons to be meaningful, the col- 
leges represented in the groups should have similar objectives and 
offerings, should enroll students of substantially the same level of 
scholastic aptitude, and should have similar educational and finan- 
cial resources. 

Quite obviously these conditions are rarely met. The norms 
group is seldom so completely defined that it is easy to assess the 
extent to which it represents a reasonable base for comparison. 
Thus, once again, a college must be prepared to determine the 
degree of compromise it can make in this regard and still profit 
by the experience. Because the factors of test content and the com- 
position of the norms group are never perfectly tenable, the data 
obtained from comparative studies of this type are always to a 
certain extent incomplete. Also, it is not always easy to work with 
data that have limitations of this type since additional local limita- 
tions can be obscured. For example, if a college fares well by com- 
parison with the larger norms group, there may be a tendency to 
feel oversecure and let the matter rest, when actually the norms 
were an inappropriate base for comparison; or, if the comparison
is less favorable, there is often a natural tendency to explain away
the differences solely in terms of test content or of differences between
the local group and the norms group and, once again, to
let the matter rest. In either case, a serious effort should have been
made to understand why the results turned out as they did. If the
performance seems superior to that of the larger group, surely it
could be worthwhile to look at the local picture to determine what
aspects of the program seem to be particularly strong so that they
may be perpetuated and even improved. Or, if the results are less
favorable, it could be beneficial to review the present program and
resources of the college in an attempt to identify reasons for this
and then plan to strengthen the college in these respects.

Occasionally, there may be no particular need to modify practices
if the local results appear in a somewhat unfavorable light
with respect to the norms group. The local college may have some
goals which are not universally held by the colleges represented
in the norms group; all it may wish is to learn whether its students
are reaching some desirable minimum level of proficiency in the
goals measured by the tests. If this point is achieved, the college
quite properly feels no compulsion to adopt the practices of any
other college or group of colleges. Even in this situation, however,
such comparisons may serve as a useful point of departure for
a study of local objectives, policies, practices, and outcomes and
will help to place them in some perspective.


STUDIES OF NONCOGNITIVE FACTORS 

Most institutions are concerned about the development in their 
students of certain kinds of attitudes and appreciations, social and 
vocational skills, worthwhile personal interests, and the like.
Therefore, if a college hopes to achieve a balance in its evaluation, 
it should gather evidence also on how well such goals as these are 
being achieved. There have been some very interesting explora- 
tions along these lines, which are well worth attention. 

Some of the types of studies which have already been discussed 
are appropriate for measuring noncognitive outcomes. The major 
difference between them lies in the types of data-gathering devices 
they employ. Since most noncognitive outcomes fall in the realm 
of so-called intangibles, which are not adaptable to linear meas- 
urement, it is more difficult to locate or develop instruments which 
provide the same degree of dependability as do achievement measures.³
That this is likely to be so should not, however, deter a college
from undertaking an evaluation of noncognitive outcomes,
for if the outcomes are defined as concretely as possible and the 
study suitably designed and executed, despite limitations, there 
will be results more indicative than uninformed hunches or opin- 
ions. But always it must be recognized that the data do have limi- 
tations. 

Some of the studies of a noncognitive nature may emerge from
the materials of regular testing programs. For example, if freshman
testing includes measures of personal qualities, it may be of
value to ascertain from them what proportion of the students
appear to have relatively serious adjustment problems. Similar
information can also be obtained from admission applications and
biographic questionnaires submitted by entering students. The
information thus gained may influence decisions regarding the
kind of guidance services to be provided. Consideration of additional
data from other incoming classes, along with the review of
data from the guidance bureau, can be used in determining
whether or not the services should be expanded or changed. Or,
again, attitudes toward certain beliefs, groups, or institutions can
be assessed at the time of entrance a
later to see whether efforts to inculcat
have been successful.

3 For a general discussion of the types of instruments available for measuring
noncognitive factors, with a discussion of their major limitations, see chap. 5 of the
present book.

These and other similar sources of information provide general
background data on the needs and strengths of students which
can never be obtained from scholastic records and tests alone. Used
in conjunction with the latter, data on noncognitive factors help
to round out the picture of a group of students, sharpening the
gray areas and bringing particular features into focus. And over a
period of time, they present descriptions of student characteristics
which both the instructional faculty and the administrative staff
can use to improve student services, curricular and extracurricular
activities, and the like.


The reader who is interested in
e certain basic points of view

pursuing considerations in test-
American Council on Education con-
tains a discussion of the approaches u

ing together under the direction of P

© of the several reports
project. One of the most
recently published works in this area is a study by Philip E. Jacob
of the impact of their education on the values held by college students
at the time of their entrance to college. Also, there are bibliographies
of personality inventories. One source for locating
these is the issues of the Review of Educational Research on
growth, development, and learning.⁷

* Dressel and Mayhew, chap. viii, “Pervasive Objectives: Attitudes,” fyah
Paul J. Brouwer, “Identifying and Meeting Common Needs,” Student Personnel
Services in General Education (Washington: American Council on A


ALUMNI SURVEYS 


All the foregoing approaches to institutional evaluation share 
one major limitation: they provide information on student devel- 
opment only for a narrowly defined period of time—the two- or 
four-year academic sequence. In the final analysis, what its grad- 
uates are like may be much more indicative of an institution’s suc- 
cess in accomplishing its major purposes. Thus, efforts to evaluate 
effectiveness are not complete until a serious attempt has been 
made to obtain further information from alumni about those out- 
comes which the college considers to be of special importance. 

What is pertinent information will vary from one institution to 
another, but all will be concerned with obtaining information on 
how former students appear to fulfill major goals in intellectual, 
vocational, social, and personal areas. For example, if a college
emphasizes preprofessional training, it should learn what proportion
of alumni complete advanced training. Similarly, it should
know how many are placed in their chosen fields and what positions
they hold. Information on the scope and depth of current
reading habits and other activities, both recreational and civic,
also indicates to a degree success in developing worthwhile understandings
and appreciations. And graduates’ opinions, in retrospect,
about the education which they received, or certain features
of it, are worthy of consideration.

Most alumni surveys have been based on data gathered through
interviews, questionnaires, or both. Tests would also be useful in
measuring the extent to which certain intellectual outcomes have
been retained or improved, but obviously there are practical problems
involved in asking alumni to take tests.

6 Philip E. Jacob, Changing Values in College (New York: Harper & Bros., 1957).
A digest appears in the NEA Journal, January 1958.

7 American Educational Research Association, “Growth, Development and Learning,”
Review of Educational Research, 25: 473 ff, December 1955 (most recent
issue).

The questionnaire can be subject to considerable misunder- 
standing when questions are not clearly phrased. The interview 
circumvents this difficulty to a large extent since the interviewers 
can make certain that the interviewees do understand. For this 
reason and because a greater proportion of the alumni approached 
are likely to cooperate, interviews give greater assurance of usable 
responses than do questionnaires. But the interview is usually a
much more costly procedure. Questionnaires are therefore more
generally used, and so it may be pertinent to discuss here some of
the ways in which their disadvantages can be lessened.


First, alumni should be told the basic purposes of the question- 


ng statement aimed at rousing their
Id suffice. Also, they should be as- 
treated in a confidential manner. 
jor purposes of the study, the question- 
d uncomplicated as possible. Ques- 
everyone will understand their in- 


vance with a small group. 

A decision must be made 
cluded in the study. For s colleges the study 
lleges with large alumni bodies 


do. In either case, returns must be 
large enough to assure a dependable interpretation.’ If the responses
of the alumni in general are to be studied, a smaller num-


however, the design of the 


; ; es by certain subcategories, 
such as alumni of five years standing 


8 For bibliography on sampling, see Francis G. Cornell, “Sample Surveys in Education,”
Review of Educational Research, 24:359, December 1954.

9 For considerations in sampling, see Helen M. Walker and Joseph Lev, Statistical
Inference (New York: Henry Holt & Co., 1953), which includes a section on sampling
and survey design.
tionnaires are discouragingly slow and small. Responses will increase
with the amount and kind of additional follow-up. Some
alumni surveys have produced returns varying from 50 to 74 
percent. 

In analyzing data from questionnaires, consideration must be 
given to the possibilities of bias in the replies. Did proportionately 
more men than women reply? More local alumni than alumni at a 
distance? More younger than older alumni? More who graduated 
in the upper half of their classes than in the lower half? And so 
forth. When such questions are not satisfactorily answered, inter- 
pretations should be qualified accordingly. 
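
A first check on bias of this kind is to compare the rate of return within each category of alumni. The sketch below is illustrative only: the category labels and counts are hypothetical, and in practice the same tabulation would be repeated for residence, age, class standing, and so on.

```python
from collections import Counter

def return_rates(mailed, returned):
    """Proportion of questionnaires returned within each category.
    `mailed` and `returned` are lists of category labels, one per alumnus."""
    sent = Counter(mailed)
    received = Counter(returned)
    return {category: received[category] / sent[category] for category in sent}

# Hypothetical mailing: 200 men and 200 women; 120 men and 80 women reply.
mailed = ["men"] * 200 + ["women"] * 200
returned = ["men"] * 120 + ["women"] * 80

rates = return_rates(mailed, returned)
overall_rate = len(returned) / len(mailed)
print(rates, overall_rate)
```

Here the overall return of 50 percent conceals a marked difference between the two groups, which is exactly the situation in which interpretations should be qualified.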

Analysis with no attempt to relate to factors known about the 
alumni as students can be quite useful in obtaining evidence on 
how well certain long-range objectives have been attained. But 
analysis can be strengthened if responses are categorized by pat- 
terns of characteristics or experiences that the alumni shared as 
students and provide an opportunity to relate their present status 
to past educational experiences. Analysis may reveal experiences 
which seem to correlate with the attainment of certain objectives, 
or the converse, and help thereby to support judgments concern- 
ing the adequacy of certain basic policies and practices. 

The reader planning an alumni questionnaire will certainly 
find it helpful to examine several alumni questionnaire studies. 
Some questionnaires that have been published or could be obtained
are those of: the University of Syracuse,¹⁰ the University of
Minnesota,¹¹ the Woman's College of the University of North
Carolina,¹² and Chatham College.¹³


EVALUATION, A CONTINUING PROCESS 


From time to time changes occur in the student body, the
faculty, the curricular offerings, the leadership, and the general
academic atmosphere of an institution. Whether these changes are
the result of systematic planning or circumstance, the institution
should be aware of them and prepared to measure their effects.
But it is difficult, if not impossible, to do this when no information
exists on the status of students before and after the occurrence
of the change. Infrequent and sporadic efforts at evaluation do
not produce the required continuity. Therefore, to be effective in
facilitating institutional study, evaluation must be continuous and
according to plan.

10 C. Robert Pace, “Follow-up Studies of College Graduation,” in Growing Points
in Educational Research (Washington: American Educational Research Assoc., 1949).

11 C. Robert Pace, They Went to College (Minneapolis: University of Minnesota,
1941).

12 Woman's College of the University of North Carolina, The Alumnae News
(Greensboro, N.C.), April 1956, pp. 12 f.

13 Office of Evaluation Services, Bulletins; available from Chatham College,
Pittsburgh 32, Pa.


7. Administering the Testing Program 


THE EXTENT AND COMPLEXITY OF ADMINISTRATIVE ARRANGEMENTS 
for test programs depend on a number of factors, the first of which 
is the educational philosophy of the institution—how it views its 
responsibilities to students and how many special services depen- 
dent on testing it believes it should provide. 

The institution that offers only a program of freshman tests in 
order to gain general background data on the group may manage
quite well administratively with limited advisory services and the
part-time attention of a faculty or administrative staff member.
This arrangement is typical in institutions that initiate modest
test programs, usually with the intention of gradual expansion. In
contrast, institutions having highly developed evaluation programs
and large universities that are naturally complex in organization
will utilize a very large staff of administrators and specialists to
engage in a wide variety of evaluation activities.

Currently a number of kinds of administrative organization for
evaluation programs are in use. A few institutions centralize
administration in the hands of a specially trained evaluation officer
who receives his authority from a dean or the president. He
probably uses a faculty advisory committee in connection with each
phase of his activities or, perhaps, a single general advisory
committee in over-all planning. This kind of plan provides the best
opportunity for general coordination. At some institutions, a
member of the education or psychology department carries a similar
responsibility. At still others the responsibility may lie with a
faculty committee whose chairman assumes direction, usually with a
concurrent reduction in his teaching load. A complex setup is
found where there are boards of specialists, with one over-all
officer (who may be one of the administrative officers of the
institution or an expert trained in some phase of evaluation), teachers
who assist test specialists in building achievement tests, a person
who specializes in vocational testing and advising, a psychometrist
who administers supplementary tests to individuals, a statistician,
and other miscellaneous staff. Sometimes the personnel of each
board represent chiefly the interests of a particular unit of the
institution, such as a school of engineering or an office of
admissions.


Organization for evaluation, then, can be completely centralized
with one board or person responsible for test activities; completely
decentralized with each unit (school, department, and so forth)
operating independently in designing and conducting its own
program; or decentralized, with certain services being handled by one
officer and latitude given to individual units to develop
supplementary programs.


the procedures 
nstitution. To 
n keeps abreast 
unit—however 
advisory com- 
onnel to help 
plans, suggest areas need- 


and have a budget.
Because certain functions of th 


Regular functions

The nature of the test activities described in previous chapters
indicates the regular functions of the person who carries
responsibility for testing in a comprehensive program: (1) supervising
testing programs for purposes of admissions, guidance, and
institutional evaluation; (2) assisting the faculty to improve their
measures of the results of instruction; (3) supervising testing programs
that support particular educational policies or curriculum analysis,
such as progression on the basis of achievement, admission to an
upper division, selection for a graduate program, establishment of
courses based on student needs; (4) informing faculty and students
of outside testing programs such as the National Teacher
Examinations; the National Science Foundation test programs; the
Graduate Record Area Tests; miscellaneous city, state, and federal
civil service tests; and premedical, prelaw, and other
preprofessional tests; (5) selecting tests for special groups, say, for evaluating
the progress of a group of students receiving remedial work in
speech, or for individual students who can be given better guidance
if specialized test data are available; (6) planning the analysis of test
data needed for meaningful interpretations for all test activities;
and (7) providing guidance and technical assistance in the
construction of testing instruments for experimental use in
instruction. Clearly, the guiding officer must be competent in
administrative matters and either be professionally trained in evaluation or
have advisory or staff resources to meet these needs.


Special functions

In some institutions, the administrative officer or committee
also supervises, for example, the reproduction of test copy, the
arrangements for final examinations and make-up examinations,
the scoring of examination papers, arrangements for administration
of noninstitutional tests as a form of cooperation in national
standardization programs, and perhaps the administration of tests
or scoring services for neighbor institutions. He, or they, may also
be responsible for institutional studies that do not involve tests
or test data, such as the collection of enrollment data for an
analysis of student mortality, but it is not within the province of this
guide to pursue these aspects of the job.


While the best arrangement is that which coordinates most of
these activities into one unit, some of them can be shared by other
college or university officers, so that special service may be
provided even without a special staff. In fact, this is probably the
rule rather than the exception in the organization of evaluation
activities.


PROBLEMS OF TEST SELECTION 
The question of who should select published (as opposed to lo- 
cally constructed) tests de 
tests to be included in th 
tories and personality tes 
who has had considerabl 
guidance. However, it is 


However, since 


in E. F. Lindquist


and one which is most helpful to anyone


the subject: statistical problems, formulae, 
sidered. 
of the tests found in mental measurements textbooks and in the
Mental Measurements Yearbooks.²

Usually, the predictive validity of tests is obtained by
determining the degree of correspondence between test scores and whatever
one is trying to predict, frequently grade-point averages. The re- 
lationship found is most frequently summarized by a numerical 
index called the coefficient of validity.³ A validity coefficient for a
test is by no means a fixed statistic. It may vary with different 
classes and even with the way a course is taught in different years. 
An instructor who uses an objective test as a measure of achieve- 
ment of English composition skills may find the validity coefficient 
varying from year to year with growth of his own instructional 
effectiveness. However, it is reasonable to suppose that if he once 
developed a test with an acceptable validity coefficient for stable 
course objectives, that coefficient would not shift greatly in sub- 
sequent administrations to the same kinds of students. On the other 
hand, on a given test the coefficient used to identify students who 
need remedial reading and that used to identify students with high 
reading literacy could be quite different. For this reason the worth 
of a test as expressed by one or more validity coefficients must be 
considered in terms of the purposes the test will serve. 
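
As a minimal sketch of how such a coefficient might be computed, the example below correlates test scores with grade-point averages using the ordinary product-moment formula. The choice of that formula, the function name, and the eight pairs of figures are assumptions made for illustration; they are not data from any program described in this guide.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical entrance-test scores and later grade-point averages.
test_scores = [40, 45, 50, 55, 60, 65, 70, 75]
grade_points = [1.8, 2.4, 2.0, 2.6, 3.0, 2.7, 3.4, 3.6]

validity = pearson_r(test_scores, grade_points)
print(f"validity coefficient: {validity:.2f}")
```

Rerunning the same calculation on another class, or against a different criterion of success, would in general give a different figure, which is the point made above.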

Apart from the fact that a particular test might not be a valid 
predictor of success, validity coefficients may vary in size for several 
other reasons: the consistency with which the test measures the 
area in question, the size of the group studied, the range of ability 
within the group, the standard used to determine success, and so 
on. Since interpretation of coefficients is no task for the novice, 
advice of a trained technician must be sought for their meaning. 
Suffice it to say here that in selecting achievement tests,
correspondence between course objectives and test content is the
primary consideration, and should be determined by the instructional
staff.
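
One of the influences named above, the range of ability within the group, can be shown with a small numerical sketch. The data are invented and the product-moment formula is an assumption about how the coefficient is computed; the example simply contrasts a whole group with its upper half.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical test scores and grade-point averages for a whole group.
test_scores = [40, 45, 50, 55, 60, 65, 70, 75]
grade_points = [1.8, 2.4, 2.0, 2.6, 3.0, 2.7, 3.4, 3.6]
full_r = pearson_r(test_scores, grade_points)

# Keep only the students scoring 60 or above, narrowing the range of ability.
upper = [(t, g) for t, g in zip(test_scores, grade_points) if t >= 60]
restricted_r = pearson_r([t for t, _ in upper], [g for _, g in upper])

print(f"whole group: {full_r:.2f}; upper half only: {restricted_r:.2f}")
```

With these figures the coefficient for the narrower group is the smaller of the two, although the test itself has not changed.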
Reliability 

A test, to be effective, should yield consistent results, so that 
the user can place confidence in the score a student obtains. That 
is to say, if this student were to take this test or a comparable form
of it again, he would obtain a similar score or a similar rank order
within the tested group.

² Oscar K. Buros (ed.), Mental Measurements Yearbooks (Highland Park, N.J.:
Gryphon Press). The last edition is the fourth, published in 1953.
³ Cureton in Educational Measurement, pp. 680 f.

In general, tests (and their subparts) can reach a high standard 
of consistency if they are sufficiently long, cover a homogeneous 
body of material, are free from technical inadequacies such as 
ambiguity, and can be scored precisely. However, because there 


of a test is an especially im- 
to be used to evaluate the 
vidual student for diagnostic 
and guidance purposes. If the objective is to obtain a general pic- 
mpares with similar groups of 
meet the highest standards of 
test consistency, although it is desirable that it do so. 

ty of test scores is usually pro- 


more of several different ways. 
A most useful kind of report is that which provides information 


on the probability that if a student is retested, his score will fall 
not more than a certain number of score points above or below


e. This is a statistic known as 
the “probable error of measurement,” 


Since no test is perfect 
he earns a slightly different score 


majority of
a given point, which would be the


somewhere between 49.5 and 54.5. The mo 
the smaller the range of scores.* 
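
The following sketch shows one conventional way such a band might be computed. It assumes the usual textbook relations, which are not stated in this guide: the standard error of measurement is the standard deviation times the square root of one minus the reliability coefficient, and the probable error is 0.6745 of that amount, the distance within which half of a normal distribution falls. The standard deviation, reliability, and score used are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)


def probable_error(sd, reliability):
    """Half of retest scores are expected within this distance of the score."""
    return 0.6745 * standard_error_of_measurement(sd, reliability)


# Hypothetical test: standard deviation of 10 points, reliability of .91.
sem = standard_error_of_measurement(10.0, 0.91)
pe = probable_error(10.0, 0.91)

observed = 52
low, high = observed - pe, observed + pe
print(f"chances are even that a retest falls between {low:.1f} and {high:.1f}")
```

A band such as 49.5 to 54.5 corresponds, on the same reasoning, to a probable error of 2.5 points around a score of 52.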
⁴ Naturally the difference must be related to the difficulty of the test and its
range of scores; a difference of 75 in a test with a range of 500 points is not large;
a difference of 75 in a test with a range of 125 is large.

The degree to which a test yields consistent results is also
indicated by its coefficient of reliability. There are several ways of
computing this coefficient.⁵ Basically, it is a mathematical
description of the relationship between two sets of scores on the same test
or alternate forms of the test when they are administered under 
identical circumstances. The closer this relationship, the higher is 
the reliability coefficient and the smaller are the chances that a 
given student’s relative position within a group will shift from one 
testing to another. As the reliability coefficient decreases, less and 
less reliance can be placed on a score as a measure of a student’s 
performance in respect to the abilities measured by the test. 

A general working rule is that if a test is to be used with an 
individual student, the coefficient should not fall below .90, al- 
though it is best that it be .95 or higher. For group comparisons 
and group prediction the coefficient of reliability should never be 
lower than .60. Ordinarily, however, it should be at least .80, and 
it is desirable that it be even higher. 
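
The working rule can be written out as a small decision function. This is only a sketch of the thresholds given above (.90 for individual use, .80 ordinarily and .60 at the very least for group use); the function name and the wording of the verdicts are hypothetical.

```python
def suggested_use(reliability):
    """Classify a reliability coefficient by the working rule in the text."""
    if reliability >= 0.90:
        return "individual and group use"
    if reliability >= 0.80:
        return "group comparisons and group prediction"
    if reliability >= 0.60:
        return "group use only, and with caution"
    return "not recommended"


for coefficient in (0.95, 0.85, 0.65, 0.40):
    print(coefficient, "->", suggested_use(coefficient))
```

The cut points are rules of thumb, so a coefficient near a boundary calls for judgment rather than mechanical classification.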

As with coefficients of validity, coefficients of reliability are not 
simple to interpret, since much depends upon understanding the 
assumptions involved in any of the several ways in which they may 
be computed and on the uses to which the test scores are to be put. 
At this phase of test selection it is usually best to seek the advice of
a person with wide experience in testing.


Difficulty 

In addition to the selection factors already discussed, tests
should be evaluated in terms of their level of difficulty. If a
test is either too easy or too difficult, the scores will not
discriminate adequately among students. (Discrimination is usually
the objective, although there are some exceptions.) For most testing
purposes, scores for a given group should range from the low to
the high end of the scale with about half the scores clustering
around the middle.

The difficulty of a test is determined by an analysis of the test
questions in terms of the types of materials that teachers believe
students are able to handle. Evidence of how the test has functioned
for other groups is also helpful. In any event, just as with the
validity of the test, one should not rely wholly upon the
statements made in the test manual. This does not by any means imply
that test authors are unscrupulous in their reporting. It simply
means that truly representative groups exist chiefly in theory: what
is easy for one group may be quite difficult for another.

⁵ See Robert L. Thorndike, “Reliability,” in Educational Measurement, p. 560.


Adequacy of norms 


The raw score of a test is simply the number of correct
responses, although it may also be the number of correct responses
reduced⁶ by the number of wrong responses or some fraction thereof.
Although parts of scores are sometimes weighted, if the items are
homogeneous and sampling of the field has been comprehensive,
weighting adds little, if anything, to the result.⁷ From the raw
scores and their statistical manipulation, the examiner can
ascertain a student's relative position in his group, his deviation from
the average of the group, the general difficulty of the test, its
correlation with any previous or later tests used, and the like. Thus,
in simplest terms, the classroom teacher who is accustomed to
writing tests and has general knowledge of his students gains further
information from raw scores by means of ranking, averaging, and
determining deviations from the average.

When test results for larger and les
for a whole class of sophomores
formance of students is to be
used, the raw scores must be converted into whatever “unit of
measure” is employed by the test publisher.⁸

⁶ For a discussion of the problem of correction for guessing, see Traxler in
Educational Measurement, pp. 347-51.
⁷ For a discussion of the considerations in weighting, see Traxler in Educational
Measurement, pp. 369-71.
⁸ See John C. Flanagan, “Units, Scores, and Norms,” in Educational Measurement,
pp. 695 f.; or see E. F. Lindquist, A First Course in Statistics (Houghton
Mifflin Co., 1942), pp. 145 ff.

Norms are most commonly reported in terms of percentile rank,
a means of relating gross rankings of scores to the proportion of
students in one hundred earning scores below a given point.
According to this practice, a student with a percentile rank of 34 has
a score that has been exceeded by 66 percent of the students in the
normative group.
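
The two ideas just described, a raw score reduced for wrong responses and a percentile rank, can be sketched as follows. The correction shown (rights minus wrongs divided by one less than the number of choices) is the conventional formula and is an assumption here, since the text says only "some fraction"; the function names and the normative scores are hypothetical.

```python
def corrected_raw_score(right, wrong, choices_per_item):
    """Rights minus a fraction of wrongs: R - W / (k - 1)."""
    return right - wrong / (choices_per_item - 1)


def percentile_rank(score, norm_scores):
    """Percent of the normative group earning scores below the given score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# A student with 40 right and 8 wrong on five-choice items.
raw = corrected_raw_score(right=40, wrong=8, choices_per_item=5)

# Hypothetical normative group of 100 students with scores 1 through 100.
norm_group = list(range(1, 101))
rank = percentile_rank(35, norm_group)

print(f"corrected raw score: {raw}")
print(f"percentile rank of a score of 35: {rank}")
```

Other definitions of percentile rank count half of the students tied at the score; the version here follows the "scores below a given point" wording.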

Raw scores may also be converted into “standard score units” 
to establish their positions above and below the average (or mean) 
score. This is done, initially, by giving the test to a large group of 
students (normative group) and computing the average score, as 
well as the “standard deviation,” which summarizes the way in 
which the scores distribute above and below the mean. By appro- 
priate equations the raw scores can be converted into a convenient 
standard scale. All later raw scores are also corrected to the same 
scale and have meaning in terms of the performance of the
original group tested. These scores are called standard scores or scaled
scores.⁹
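
A minimal sketch of such a conversion follows. The particular scale (a mean of 50 and a standard deviation of 10) is an assumption chosen for illustration, since publishers use various scales; the function name and the scores are likewise hypothetical.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores, norm_scores, scale_mean=50.0, scale_sd=10.0):
    """Express raw scores on a scale fixed by the normative group."""
    norm_mean = mean(norm_scores)
    norm_sd = pstdev(norm_scores)  # standard deviation of the normative group
    return [scale_mean + scale_sd * (x - norm_mean) / norm_sd for x in raw_scores]


# Hypothetical normative group (mean 5, standard deviation 2).
norm_group = [2, 4, 4, 4, 5, 5, 7, 9]

# Raw scores earned later by three students on the same test.
later_scores = [5, 7, 3]
scaled = to_standard_scores(later_scores, norm_group)
print(scaled)
```

Because the mean and standard deviation come from the normative group alone, scores earned later are read against that original group, as the paragraph above describes.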

But, again, the user of this or any other scale needs to be aware
of the characteristics of the normative group. For example, if a
college serves students who are from families educationally
underprivileged, say from an impoverished town, and uses results from a
published general aptitude test of good reputation as a basis for
admission, it will probably not use the cutting points or the
percentile norms provided in the test manual since the normative group is
of a different caliber. Provided the examination content does not
penalize the educationally underprivileged group in particular and
provided the scale is divorced from other populations and new
norms are established on the basis of the underprivileged group
or other groups like them, it is proper to utilize the examination
and its scale.¹⁰

Obviously, the original sample of students (the norms group)
upon whose test performance the scales are computed becomes
all-important. If this group is biased in any way, comparing the
performance of the local group with the normative group is
inappropriate. Furthermore, in utilizing the norms, the user must
ascertain that the norms population is one with which it is reasonable
to compare one’s own students. Local data will help the college
evaluate how well a student compares with students who have had
identical instructional opportunity in its own institution or, in
the case of an entering group, with students of the caliber selected
for, or patronizing, as the case may be, the college. Thus, the
lack of relevant norms need not prevent a college from using a
test which is otherwise well suited to its purposes.¹¹

⁹ To study the problem of scaling tests, see Flanagan in Educational Measurement,
chap. 17.
¹⁰ For a discussion of the inadequacies of some tests for underprivileged groups,
see Allison Davis, Social-Class Influences Upon Learning (Cambridge, Mass.: Harvard
University Press, 1948).


Comparable forms 


Tests that are published in two or more comparable forms lend 
themselves to more uses than tests that have but one form, for stu- 
dents may be retested to measure progress, breaches of test security 
are less likely to occur, and, in special cases where scores obtained 
by students seem open to question, the results may be checked by 
administering another form.

When there is a need to select tests with comparable or
equivalent forms, it is desirable to examine the tests to determine
whether they are indeed comparable. (It should be borne in mind
that no amount of statistical operation can make two tests
embodying different objectives comparable; they must have basic common
objectives and content before statistical equating of scores can be
meaningful.) Questions to be reviewed are: Do both forms cover
the same content and objectives? Have various types of materials
received approximately the same emphasis? Do the test items seem
to be of equal difficulty? Can results on both forms be interpreted
similarly? In addition to examining the test itself, the opinions of
experts who are familiar with the technical problems inherent in
developing comparable forms are useful in obtaining answers to
some of these questions as is, again, the evidence published in the
manual and in the literature.


Completeness of manual 


If the test is accompanied by adequate information concerning
its development, validity, and reliability; if it has clear directions
for administering and scoring; and if
norms are provided, one may say that a test has a complete manual.
This material should be organized for easy use. Also it should
contain information on skills and abilities that the tests and
subtests measure, on the difficulty level of each item, and suggestions
for the teacher, counselor, or administrator that will help each
make the fullest use of the test results.

¹¹ A very clear discussion of some of the factors which operate to affect norms,
and the difficulties of publishers in supplying generally suitable norms, will be
found in Test Service Bulletin, No. 39 (New York: Psychological Corporation, May
1950); available upon request from the Corporation, 522 Fifth Ave., New York, N.Y.


Miscellaneous points 

Other points to consider in selecting a test are (a) cost, both of 
basic materials and of scoring and interpretation; (b) format of 
the test; (c) the probability of its future availability, if a long-range 
program is contemplated; and (d) the feasibility of administration 
of the test under whatever local conditions of staff and division of
class periods obtain.


THE MECHANICS OF TESTING 


In embarking upon a program of evaluation, the sponsoring
group should consider problems of the mechanics of the operation.
It will want to ascertain, in particular, that provisions are made 
in advance for administration, scoring, test reproduction, the anal- 
ysis of the results, and reporting to faculty and students. 


Administration, scoring, and reproduction

Needless to say, administration should both be planned
carefully in advance and carried through in all its aspects with
meticulous attention to detail. Inexperienced examiners will do well to
review the manuals of some of our better tests that give explicit
instructions on administration, many of which are universal in
application. The long experience of the College Entrance
Examination Board in conducting its examinations has resulted in an
excellent manual entitled Supervisor's Handbook; it may be
purchased from Educational Testing Service for a nominal sum.¹²
Traxler discusses test administration in detail in Educational
Measurement.¹³ For inexperienced examiners these two items will
be very helpful.

¹² Address: Educational Testing Service, Box 592, Princeton, N.J.
¹³ Pp. 329 ff.

Accurate scoring of tests is also requisite to valid and usable test
results. Whether the tests are hand- or machine-scored, there are
good and bad practices relative to such matters as preparation of
the most suitable scoring key, handling of keys, rechecking for
accuracy, recording the results, and so on. Since these have been
described by Traxler,¹⁴ they need not be recounted here.

Test reproduction has also been fully treated elsewhere, by
Spaulding,¹⁵ who has described the various types of reproduction
processes suitable for tests.


Statistical treatment of results 


A testing program that has reached the analysis stage is nearing
the point where it will begin to serve the purposes for which it
was established. However, in order to bridge the final gap between
meaningless numbers and useful descriptive information, the data
must be subjected to further systematic treatment. Such questions
as how an entering class compares with classes entering other
institutions or with another year’s entering class, what a particular
student’s academic strengths and weaknesses are, whether a
revised natural science program has achieved expected objectives,
and the like, can be considered only after the data are statistically
analyzed and interpreted according to some defined reference
standard, such as local or published norms.

Thus, in almost all instances there will be a need to determine
the score which describes the average performance of the group
tested. In addition to this statistic, whether it be the median or the
mean, information on th
purposes will point up both the kinds of information that can be
gained and their importance. This discussion will necessarily be
sketchy since descriptions of statistical techniques fall outside the
scope of the present work.

¹⁴ Ibid., pp. 329-68.
¹⁵ Geraldine Spaulding, “Reproducing the Test,” in Educational Measurement,
pp. 417 ff.

The admissions program uses several kinds of information— 
test scores, rank in class, and principal's rating. Since there is 
normally a need to compare the standings of candidates on the 
various criteria, more precise predictions of success can be ob- 
tained if the relationship between the selection criteria and grade- 
point averages is studied. Local norms, percentage of agreement 
among the criteria and between the criteria of selection and the 
eventual outcome in terms of student success (coefficients of cor- 
relation), and expectancy tables based on this agreement are es- 
sential, though these need not be computed annually when criteria 
are well established. 
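
An expectancy table of the kind mentioned can be sketched as below: candidates are grouped into score bands, and the proportion who later succeeded is reported for each band. The band width, the definition of success, the function name, and all figures are assumptions made for illustration only.

```python
from collections import defaultdict


def expectancy_table(test_scores, succeeded, band_width=10):
    """Proportion of students succeeding within each band of test scores."""
    counts = defaultdict(lambda: [0, 0])  # band -> [successes, students]
    for score, ok in zip(test_scores, succeeded):
        band = (score // band_width) * band_width
        counts[band][1] += 1
        if ok:
            counts[band][0] += 1
    return {band: wins / total for band, (wins, total) in sorted(counts.items())}


# Hypothetical entrance scores and whether each student later earned
# a satisfactory first-year average.
scores = [42, 47, 51, 55, 58, 63, 66, 68, 71, 77]
success = [False, False, False, True, False, True, True, False, True, True]

table = expectancy_table(scores, success)
for band, proportion in table.items():
    print(f"{band}-{band + 9}: {proportion:.0%} succeeded")
```

With real classes each band would need enough students for its proportion to be stable, which is one reason such tables are built from several years of local data.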

Guidance, like selection, involves prediction, and for imme- 
diate purposes the primary point of departure is the local com- 
petitive group. All the data needed for selection purposes are 
useful as well as data showing the relationship of ability and pre- 
vious achievement in specific programs of study. The test results 
will eventually be used to help students understand their skills 
and abilities; thus, individual profile charts which plot strengths 
and weaknesses are helpful. For other guidance purposes, such as 
interest testing, comparisons with larger norms groups are ad- 
visable. 

The use of tests in course placement is in many respects like 
that in admissions. It differs in that predictions are made within 
very narrowly defined areas—the probability of success within a 
specific course. Local data are all-important. Test scores should 
be related to success in the course, and experience or expectancy 
tables are helpful in establishing appropriate cutting points. In 
some cases, placement will not involve prediction as much as ap- 
praisal of current status: How does this student compare with 
students who have successfully completed a given course? Here 
scatter plots which compare test scores with grades or other
qualitative descriptions of successful students in the course are
relevant.
Instructional uses of tests, like the purposes thus far mentioned,
require primarily reference to local data. How the class as a whole
performs on the test and the range of proficiency represented are 
the points of major concern. Measures of central tendency and 
scatter will be needed. On occasion, comparisons with published 
norms will be useful, as may be annual comparisons between 
classes. Analysis of subscores within the test will show the extent 
to which teaching objectives in a course are being met. Item 
analyses will reveal the difficulty of particular categories of ma- 
terials and also be useful in improving the test for future use. 
Sometimes comparisons of gains between pre- and post-tests will 
be required. At other times the need will be for special studies 
to evaluate the success of one teaching approach over another. 
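
A simple item analysis of the sort referred to above might look like the following sketch. It assumes two common indices that the text does not itself define: difficulty as the proportion of students answering an item correctly, and discrimination as the difference between the upper and lower halves of the class in that proportion. The function names and the response data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]


def item_discrimination(responses):
    """Upper-half minus lower-half proportion correct, for each item."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest totals first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    n_items = len(responses[0])
    return [
        (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / half
        for i in range(n_items)
    ]


# Hypothetical results: one row per student, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
]

difficulty = item_difficulty(responses)
discrimination = item_discrimination(responses)
print(difficulty)
print(discrimination)
```

In this invented class the second item separates stronger from weaker students sharply, while the third does not separate them at all and would be a candidate for revision.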

Institutional surveys for planning and curricular purposes make
greater use of published norms since comparisons from one group
the statistical significance of
be ascertained, which neces-
ard deviation as a first step.¹⁷
ar charts or histograms may

Although the above discussion is sketchy, the general methods
for certain approaches have been suggested in the specific
chapters dealing with the major testing purpose. A beginner should
not attempt to put any of the methods into use until he has
referred to some standard text in statistics.

Above all, whatever measureme
meaningful to all concerned b
plans, by enlisting their assista and by re-
his usually
s to clarify
em. It undoubtedly means maint
enthusiasm
for the objectives of ea
gged down
by what can often become a bewildering mass of data. It means
also remaining sufficiently alert to the needs of officers, groups,
and individuals within the institution to be in a position to
relate available instruments or data to their current problems.

¹⁷ E. F. Lindquist, Statistical Analysis in Educational Research (Boston:
Houghton Mifflin Co., 1940), pp. 54 ff. Discusses methods of computing and
interpreting significance of mean difference.


REPORTING TO STUDENTS AND FACULTY 

When a report is made to a student, it is advisable to give him, 
not only his raw scores, but as much of the following information 
as possible: (1) He should have some point of reference, such as 
his local class or group or a preceding class or group, with which 
to compare his own scores; it may be possible to do this by using 
a simple distribution of scores and by marking his position in the 
distribution. (2) When it is relevant, he should be able, similarly, 
to compare his scores with those of students at other institutions. 
(3) In certain tests, it will be profitable for him to compare his 
scores with his own performance at a previous time or to compare 
his relative achievement on the various tests and their subparts; 
this is more complicated, requiring, as a minimum, percentile 
scores referred to some basic group or the use of some form of
standard score.¹⁸ Usually the purposes of the tests should be
reiterated since students quickly forget such things.

A report to the instructional staff concerned with the test should
begin with a review of the purposes for which the test was
administered and indicate to what extent the purposes were realized;
in addition it should tell them what they can expect to learn from
the data as they are presented, what the salient facts are about
local results and about the results obtained elsewhere, and how
much statistical significance can be attached to the averages and
to the individual scores in terms of the concept of variability in
scores.¹⁹ Sometimes graphic presentations are more readily grasped
than tabular data; these are discussed in standard texts.²⁰


¹⁸ Flanagan in Educational Measurement, pp. 722 ff., discusses test scaling and
various procedures for identifying the best scaling method for a particular
situation; additional bibliography on standard scores is given at end of chapter.
¹⁹ See any standard textbook for discussions of test score variability. The reader
may consult, for instance, Henry E. Garrett, Statistics in Education and Psychology
(4th ed.; New York: Longmans, Green & Co., 1953), chap. iii; or Lindquist, A First
Course in Statistics, pp. 69 ff.; or Karl J. Holzinger, Statistical Methods for Students
in Education (Boston: Ginn & Co., 1928), pp. 101 ff.
²⁰ Holzinger, op. cit., chap. iv; and Lindquist, A First Course, chap. iv.

A well-conceived and
though of modest dimension


PART II


Descriptions of Selected College and
University Testing Programs


Introduction


IN THE DESCRIPTIONS OF SEVEN PROGRAMS OF TESTING WHICH FOL- 
low, there is much that illustrates points made in the foregoing 
pages. Each contribution, however, was written independently of 
the main text, and their relationships, therefore, are somewhat 
coincidental. 

Some of these programs have developed over a long period of
years and some are relatively young. A variety of forms of
organization for the administration of testing programs is
represented. Of particular interest are the descriptions in each of how
tests are used to implement different aspects of its educational
planning. While all use tests to serve certain purposes, such as
admissions and placement, the fact that they use different kinds
of tests to do this and the variety of ways in which each operates
in the use of tests beyond these purposes reflect a versatility which
it was thought would be provocative.

The presentations are from institutions of different types of
organization and of different-sized enrollments. Among them will
be found descriptions of the programs conducted at a small
private Eastern liberal arts college for women (Chatham College);
a coeducational liberal arts college within a large private
Midwestern university (University of Chicago); a larger private
Eastern liberal arts college for men (Dartmouth College); a
coeducational college of liberal arts within a municipally supported
university (University of Louisville); the program of a counseling
center of a very large Midwestern state university (University of
Minnesota); and the programs of two Western institutions, one a
large city-supported public junior college (Pasadena Junior
College), and one a state-supported college of liberal arts (San
Francisco State College).


8. The Testing Program of 
Chatham College 


LILY DETCHEN, Director of Evaluation Services 


RELATIVELY FEW SMALL COLLEGES MAINTAIN OFFICES THAT ENGAGE 
in comprehensive evaluation activities, chiefly because most such 
institutions believe that they cannot afford these services. This
point of view is unfortunate, for it is not difficult to demonstrate 
that in the long run the provision of evaluation services is eco- 
nomically sound. Such an office can assist a faculty in their instruc- 
tional responsibilities; a better instructional impact generally 
means better satisfied students and a higher retention rate. The 
selection of students who are scholastically able and eager is in
itself an economy, considering the time and energy consumed by
recruiting and retention activities in most private colleges. The 
provision of data that facilitate initial approaches to students and 
that assist with plans for their academic and general welfare surely 
represents an economy of effort for the staff and a more com- 
fortable state of mind for a student, eliminating, as it can, many 
false moves on repetitious courses, placement at a wrong level, 
too heavy or too light an academic load, and so on. If evaluation 
activities accounted annually for the retention of only six students 
who might otherwise quietly fold their tents and slip away, they 
pay their way. 

Not only can the small college afford such an activity, but in-
deed it should and must afford it, for this may mean its survival.
The administrator of the smaller college who thinks [he cannot]
“afford” the continual, systematic study and review of [his]
program and student body, entailing necessarily some
facilities and a professional staff member to plan with
his faculty, is not only maintaining a static educational [program]
but also is foolishly losing students and, hence, income. Such a
program can have its finest opportunity in the smaller college. 

While it is granted that the large university can do many things 
better because it does have more specialists, the most compre- 
hensive, useful, and personal interpretations are possible, I venture 
to say, in the informal, relaxed, and administratively uncluttered 
atmosphere of the smaller institution, provided that institution 
wishes to help its students to that degree. 

Chatham College is a small liberal arts college of approximately 
450 students and 65 full-time-equivalent faculty and administra- 
tive staff. During the past decade the college has instituted a series 
of curricular changes. It has been deemed important to maintain 
an analytical climate in respect to them, and simultaneously to 
keep planning abreast of the best research and evaluation practices. 
The Office of Evaluation Services was established ten years ago to 
assist in analyses of the curricular changes that were being intro- 
duced at that time and to promote a spirit of inquiry about the 
total educational venture. It is not our purpose to discuss here 
the total evaluation program at Chatham College but rather the 
current uses that it makes of tests and relevant considerations. 

At Chatham the staff for this activity consists of a director 
trained in evaluation, a secretary-assistant with no previous tech- 
nical training in testing, and twenty-five hours weekly of student 
clerical aid. Occasional examiner assistance is recruited from the 
admissions staff and a clerical allowance of $200 permits some extra 
occasional help. An IBM scoring machine and an electric cal- 
culator are necessary to the operations. Except during Freshman 
Week, no more is undertaken than can be handled by the office 
staff. It should be mentioned that the reproduction of examination 
copy, a sizable item, is managed by another unit. The office oc-
cupies a 20-by-27-foot area, including built-in closets for storing
tests, and is glass-partitioned into three units. Adjacency to other 
administrative offices gives easy access to student records; location 
in the principal classroom and teacher-office building increases
its efficiency and accessibility; a nearby conference room is avail- 
able. Annual operating cost is about $11,000 for 450 students; 
there should be but small increase in cost for as many as 600 stu-
dents. Expenses are borne by the office budget; none are charged
against departmental or other special budgets but are considered
as part of the total instruction budget for the college. 
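
The cost figures just given can be put on a per-student basis. The short Python sketch below is offered only as an illustration of that arithmetic. The $11,000 total and the enrollment of 450 come from the text; the figure for 600 students rests on an assumption that is not in the source, namely that the total stays at $11,000, and so it should be read as a floor suggested by the phrase "but small increase" and not as a reported cost.

```python
# Per-student cost of the evaluation office, from the figures in the text.
ANNUAL_COST = 11_000      # dollars per year, as reported
ENROLLMENT = 450          # students, as reported


def cost_per_student(total_cost: float, students: int) -> float:
    """Return the annual cost per enrolled student, rounded to cents."""
    return round(total_cost / students, 2)


current = cost_per_student(ANNUAL_COST, ENROLLMENT)

# Assumption (not in the source): total cost unchanged at 600 students.
# The text says only that the increase would be small, so this is a floor.
floor_at_600 = cost_per_student(ANNUAL_COST, 600)

print(f"At 450 students: ${current:.2f} per student")
print(f"At 600 students (cost held constant): ${floor_at_600:.2f} per student")
```

On these figures the office costs a little under $25 per student per year at the present enrollment.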

The selection of tests for the Chatham program may be 
influenced in any particular instance by several factors. A primary 
consideration is that no more tests are administered than can be 
utilized in terms of purposes and staff. Some tests are chosen, not 
because they are the very best instruments available in that area, 
but because they complement economically some other examina- 
tion also being used. For example, someone reviewing the list 
might ask why a more diagnostic reading test is not included.
Earlier such a test was used. More recently this has not been done
because the current opinion is that, in the face of other needs,
reading difficulties among our rather select group of students are
not sufficiently prevalent to justify the employment of a reading 
expert, and until such time as we think they are, diagnostic read- 
ing testing is wasteful. It is much more important that we know 
whether students can, under the more reasonable and generous 
time allowances that duplicate their study and examination con- 
ditions, do analytical and interpretational reading, with accurate 
comprehension of material of standard difficulty. Testing for that 
objective is emphasized, and several tests of varied subject content, 
which also serve additional purposes, are used to obtain that in- 
formation. 

We also like to include some tests which do not impose rigorous 
time limits. While superior students may be expected to perform 
well under speeded conditions, some quite able students do not 
concentrate to the best of their ability under pressure. Since our 
college course examination periods are generous and, we hope, 
relaxed, there seems to be more sense in judging the performance 
of students tested under more leisurely conditions.

Sometimes we select and use an examination more for the pur-
pose of an over-all appraisal of a group than for the appraisal of
individuals in the group. The performance of all freshmen on an
aptitude test that has been administered to them locally under
uniform conditions of motivation and environment provides a
gauge that has many general research and evaluation uses. We like,
therefore, to supplement the required College Board Scholastic
Aptitude Tests with such an administration, and we use the School
and College Ability Tests for the purpose.

Frequently, it is necessary to weigh the substitution of a new 
and somewhat better examination for one that has had several 
years of use in our own college and the performance of which, 
therefore, we already understand. We then introduce the new 
test but retain temporarily some of our regular battery until we 
establish the validity (for us) of the newer instrument. 

Also, we bear in mind that no test is equally suited to all stu- 
dents, varying as they do in quality; administration time require- 
ments, if nothing else, prohibit the publication of examinations 
with such comprehensive validity and reliability. For example, we 
have found that, of the tests we use, the American Council Psy- 
chological Examination, the Scholastic Aptitude Test, or the 
School and College Ability Tests will do a better job of identifying 
the brightest in our groups than will the high school level General 
Educational Development social studies test, which is more useful 
for identifying the academically weaker student. 

Where there is an interest only in a rough screening of students, 
which occurs when we sift for those students who should try ex- 
emption examinations, we use brief examinations in preference to 
more time-consuming ones of higher statistical reliability and dis- 
criminatory validity. Provided we place our cutting points prop- 
erly, we know that we have culled a group which will contain the 
students we really want to identify. To find eight students, say, we 
may have to subject twenty-five to a rigorous examination session, 
but one hundred and twenty-five can be excused. 
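
The economy of this rough screening can be seen by working through the illustrative counts in the paragraph above. The Python sketch below is a minimal illustration and not a description of office procedure; it assumes that the one hundred and twenty-five excused and the twenty-five examined together make up the whole screened group, which the text implies but does not state.

```python
# Two-stage screening, using the illustrative counts given in the text.
excused = 125        # excused after the brief screening test
examined = 25        # sent on to the rigorous examination session
found = 8            # students the rigorous examination identifies

# Assumption: excused + examined is everyone who took the brief test.
screened = excused + examined
share_excused = excused / screened       # fraction spared the long session
yield_rate = found / examined            # fraction of examinees who qualify

print(f"Screened: {screened}")
print(f"Excused from the rigorous session: {share_excused:.0%}")
print(f"Yield among those examined: {yield_rate:.0%}")
```

Read this way, a brief test of modest reliability spares about five students in six the longer session, at the price of examining roughly three students for every one finally identified.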

Above all, we keep in mind that while a student cannot perform 
better in a test than she really is, her performance at a given time 
may not show her real attainment; she may have emotional insta- 
bilities that loom at test times, or be temporarily distracted by ill- 
ness or problems, or not be motivated to do well. Therefore, for 
some individuals it is well to use a number of examinations admin- 
istered at different times, with follow-up testing in questionable 
instances. For example, for most of our applicants we receive, 
from their high school, records of several IQ testings—along with 
records on other achievement tests obtained at different ages. In
addition we have a College Board SAT and Achievement Tests
report. These may all be well correlated, roughly, with the high 
school grade record. If so, we can rejoice and proceed with evalua- 
tion in earnest. But when there is a serious disagreement among 
some of these data, the applicant is retested either at the college or 
at the high school. No fixed set of tests is utilized; we select those 
tests that seem to cover best the applicant’s preparation or that 
help resolve the discrepancy in the data already at hand. We may 
thereby salvage annually as many as ten desirable students who 
might otherwise be rejected. In this process, we select tests that 
we believe can be administered by someone relatively inexperi- 
enced in giving tests. 

There are other considerations, but the above serves to illustrate 
the point that tests cannot be blindly accepted or naively cate- 
gorized. Even the tests of the most reputable research teams have 
limitations in a given situation. Do not misunderstand me. I con-
sider testing to be an invaluable tool. But there are few, if any,
tests with built-in, ready-made solutions. The acceptance of the 
role of testing is, in general, threatened, not so much by testing 
instruments that claim to do what they do not do, as by persons 
who hope that they can do what they cannot, because life would 
be so much simpler if one could but accept the infallibility of 
testing. Judiciously used—and this means a balancing with still 
other tests and other relevant factors—tests certainly help in the 
formulation of reasonable approaches to some of the problems
that arise with students. The role of the interpreter, however, is
strategic.


KINDS OF TESTS USED 

With some of the foregoing considerations and reservations in
mind to help account for our selection of some one test over
another, I will now describe the purposes for which we use tests
and give the titles of tests currently in use.


Scholarship examinations 


We have the usual problem of the small private college in most
wisely dispensing the generous, but always inadequate, scholarship
funds. We observe a graduated scale of grants, and naturally we
want our larger awards to go to the most able of those who can
make best use of assistance. We utilize the scholarship application
form and data-collecting service provided by the Educational Test- 
ing Service, but the Admissions Office staff makes the final in- 
terpretation of need. Personal interviews are held with all ap- 
plicants. 

Earlier, most Chatham applicants were Pennsylvanians. Origi- 
nally, this group was tested at the college in the spring. To them we 
gave such tests as the Iowa High School Content Examination and 
the Test of Critical Thinking. As the radius of applicant clientèle 
broadened in accordance with a new emphasis in the admissions 
program, the testing of applicants at the college became increas- 
ingly difficult. Therefore, in 1954 Chatham began to require that 
scholarship applicants take the College Board Scholastic Aptitude 
Tests and three of the CEEB Achievement Tests, one in English 
and the other two of the applicant’s choice. It continues to use 
the locally administered tests mentioned above in a few instances 
of late “deciders.” 

I might add, to further complicate this story, that some of our 
scholarship applicants have been taking neither our own schol- 
arship battery nor the College Board battery, but the American 
Council Psychological Examination and an achievement exam- 
ination developed especially for a county scholarship testing pro- 
gram. Since competition in the latter is high, we know that stu- 
dents in the top third of the group are exceptionally able. 
Therefore, when our applicants come from the high end of that 
distribution, we seldom need to test further. 

This means, then, that when we begin to assign funds we may 
have a group of applicants with three separate kinds of test data— 
a situation less complicated than it seems. With a large pool of 
able students adequately identified by any of these methods, 
recommendations, other personal qualifications, and need become 
the final weights in decision-making. 


*The Test of Critical Thinking, Form G, and the Test of Science Reasoning
and Understanding were developed by the Cooperative Study of Evaluation in
General Education of the American Council on Education. For availability, write
to Educational Testing Service, 20 Nassau St., Princeton, N.J. For a list of the
Study tests and inventories deposited with the ETS, see Paul L. Dressel and
Lewis B. Mayhew, General Education: Explorations in Evaluation (Washington:
The Council, 1954), Appendix I, p. 287.


Admissions tests


With increases in tuition and a corresponding increase in 
number of scholarship applications, with extension of our geo- 
graphic borders which makes local pretesting infeasible, with an 
increased demand from other colleges that applicants in whom we 
share interest take the CEEB Scholastic Aptitude Tests, and with 
our own long-established policy of requiring the Scholastic Ap- 
titude Tests not only of scholarship applicants but of doubtful 
candidates, we were finding by 1954 that three-fourths of our appli- 
cants had already taken the Scholastic Aptitude Tests, and we 
knew, too, that having that measure for them in advance fre- 
quently was most useful. Therefore, it was inevitable that we 
should become one of the colleges requiring the Scholastic Apti- 
tude Tests of all applicants. This occurred in 1956. With the en- 
trants of 1958 we also began to require of all applicants three Col- 
lege Board Achievement Examinations. One of these is English 
and the other two are of the student's choice. 

We continue to use other tests in admissions. There is always 
the student who failed to get to the examination center and in 
whom we are interested anyway! (And there is still an occasional 
high school that never heard of the College Boards!) While the 
large majority of admissions cases are clear-cut, with data and rec-
ommendations consistent enough to inspire confident decisions, as
many as a fourth of the cases show warning signals which experi-
ence has taught us to heed. We watch for conflicts among grades,
IQ’s, the various IQ’s when more than one IQ test has been used,
College Board examination reports, independent high school test-
ing reports, Regents Examinations reports, state-wide testing re-
ports, and so on. The most frequent conflict is that between grades
and examination data, with the latter lower. In such cases, we
make further inquiry and frequently require further testing. Re-
testing in an “easier” environment with nonspeeded tests occasion-
ally explains away a discrepancy between a good high school re-
cord and a poor examination performance. (I venture that this is
true for one out of five retest cases.) The applicant is tested at the
college, when that is possible, or a selection of tests is sent to a
school or college for her. What is sent varies with the case, but
usually we send several tests from the following list: the Test of
Critical Thinking, the Test of Science Reasoning and Under-
standing, two of the USAFI Tests of General Educational Develop-
ment (Interpretation of Reading Materials in the Social Studies
[high school] and Correctness and Effectiveness of Expression [col- 
lege]), the Iowa High School Content Examination, the Nelson- 
Denny Reading Test, and the Cooperative General Culture Test 
(for transfer students only). On most of these tests, the college has 
its own norms based on the performance of six to twelve entering 
freshman classes. 
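
The kind of warning signal described above, a good grade record set against poorer examination data, can be pictured as a simple comparison of two standings. The Python sketch below is purely illustrative. The percentile scales, the 25-point threshold, the `Applicant` record, and the `needs_retest` function are all hypothetical; the text names no numeric rule, and in practice the judgment rests on the whole record and not on a single difference.

```python
from dataclasses import dataclass

# Hypothetical threshold: the source names no numeric rule for a "conflict."
GAP_THRESHOLD = 25  # percentile points


@dataclass
class Applicant:
    name: str
    grade_percentile: int   # standing on high school grades (hypothetical scale)
    test_percentile: int    # standing on examination data (hypothetical scale)


def needs_retest(a: Applicant, threshold: int = GAP_THRESHOLD) -> bool:
    """Flag the most frequent conflict: grades well above test results."""
    return a.grade_percentile - a.test_percentile >= threshold


applicants = [
    Applicant("A", 85, 80),   # records agree
    Applicant("B", 90, 55),   # grades high, tests low: flagged
    Applicant("C", 60, 88),   # tests high, grades low: not covered by this rule
]

flagged = [a.name for a in applicants if needs_retest(a)]
print(flagged)
```

The sketch covers only the most frequent conflict named in the text; disagreements among several IQ records or between separate test reports would need their own comparisons.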

The tests which are given to all freshmen at entrance serve the 
following purposes: 

1. Describe the student for general purposes of immediate aca-
demic advising and sometimes for later advising in helping her
make decisions; 

2. Inform faculty members about the general caliber of students 
that the college receives and the caliber of students who eventually 
enter their particular courses or major areas; 

3. Provide the several counseling deans and faculty counselors 
to students with general estimates of students, so that they have a 
more complete understanding of the capacity of any student with 
whom they may have special dealings (for example, they might or 
might not encourage a student to assume heavy work in leadership 
responsibilities with such evidence at hand); 

4. Screen those students who will be given an opportunity to 
try for exemption from certain required courses; 

5. Decide on best placement in foreign language and mathe- 
matics courses; and 

6. Occasionally, with later posttesting, determine the gains 
made by the group in some subject. 

With so many purposes to be served, and with the time available 
generous but never quite what is needed, we have had to trim test 
use closely. The present required entrance tests include: the 
School and College Ability Tests (only recently replacing the 
ACE Psychological Examination); the two GED tests, Interpretation
of Reading Materials in the Social Studies (high school), and
Correctness and Effectiveness of Expression (college)*; the Nelson-
Denny Reading Test; the Test of Science Reasoning and Under-
standing; the Test of Critical Thinking; the Cooperative Foreign
Languages Tests; the Cooperative Intermediate Algebra Test; and
a locally constructed, brief exemption screening test related to a
freshman course, Human Development and Behavior, for which
no published test is available.


* Available from the Veterans’ Testing Service of the American Council on Educa-
tion, 1785 Massachusetts Ave., N.W., Washington, D.C.

The results for the first five tests named above are listed and 
reported in “profile” form to faculty advisers and deans, with all 
faculty also receiving percentile score lists. The results for the re-
maining examinations, such as the foreign language examinations,
go only to the faculty concerned. These items are delivered prior
to the first registration.


Exemption examinations 


The exemption examinations, for those subjects in the required
basic curriculum from which [...] meets the standards, [...]
Correctness and Effectiveness of Expression,* to screen [...]
English composition exemption examination, which is locally pre-
pared. The GED test, Interpretation of Reading Materials in the
Social Studies, screens for both History of Western Civilization
and Problems of Modern Society, with distribution of high school
units in history also a consideration for selection for that examina-
tion. The history and social science [...] locally constructed,
although the [...] time to time with such standardized [...]
about 75 screenings and 25 successful exemptions, excluding for-
eign language exemptions, in which approximately 30 percent of
the group qualify at the required two-college-year standard.


* In 1959 we may be able to replace this examination in our battery with the
College Board Achievement Test in English.

There is under review, currently, a policy for accepting in lieu 
of some of our exemption examinations the results for those Ad- 
vanced Placement Examinations of the College Entrance Exami- 
nation Board which are pertinent to our basic curriculum require- 
ments. 


Course tests 


Although occasionally an instructor may wish to give some pub- 
lished test in one of his courses, this is exceptional. Standardized 
tests seem to have a way of simply not fitting; or, when they do 
fit, the norms are completely unsuitable. Generally, the needed 
kind of course examination simply does not exist; the few that 
bear titles similar to the course either do not cover the objectives 
of the course or else they are weighted differently. This is true 
both of college courses in general education and of courses that 
are traditionally of the college content type. 

For the most part, then, instructors write their own examina- 
tions, although they may borrow heavily, with the proper permis- 
sion, from the examinations of friends located in other colleges 
and universities. A welcome recent development under the spon-
sorship of the Educational Testing Service has been the collection
and cataloging of college test questions, which may be made availa-
ble gradually by subjects in bound volumes. Just one of these
catalogues, in the natural sciences, is now ready.* The idea is that
any instructor may select ready-made questions to suit his own
teaching objectives and emphases. The Office of Evaluation Serv-
ices helps any instructor or group that is interested to set up exam-
inations and to analyze them after administration. When registra-
tion supplies the necessary populations, the best of the material
from examinations in the general education courses is culled for
future use with new materials, which are, in turn, analyzed. It is a
more or less continuous process of sowing, reaping, and sorting.


* Paul L. Dressel and Clarence H. Nelson, Questions and Problems in Science:
Test Item Folio No. 1 (Princeton, N.J.: Educational Testing Service, 1957).


General examinations


There are batteries of examinations which can be used to evalu- 
ate achievement in very broad terms for some major objectives 
of general education. We have used such batteries from time to 
time and will undoubtedly continue to do so. I refer to such bat- 
teries as the National College Sophomore Testing Program, the 
Graduate Record Area Examinations, the Tests of General Edu- 
cational Development, and the less well known and not too gener- 
ally available tests and inventories of the Cooperative Study of 
Evaluation in General Education of the American Council on
Education, which stresses critical thinking skills. A college which
is interested in a general self-evaluation may want to make use of
each one of these batteries at different times, since each series has
an individual and important validity and a different quality of
student and type of institution represented in its norms.


TESTING FOR SOME SPECIAL PURPOSES 

Besides these regular functions of testing at Chatham, other
special applications of tests are made. Our college sometimes considers
the application of a student who has not been graduated formally
from high school, but who, nevertheless, seems ready for college
and wishes to enter. We have been admitting one or two such
students each year for about ten years. Occasionally an applicant has
taken a commercial program in high school, but finds it possible
and desirable at the end of her senior year to attend college.
Occasionally some older applicant wants to enter, but, after
her long absence from study, lacks the confidence to do so. Testing
is helpful in all of these cases. We have sent tests for students
in foreign countries, not so much to decide whether or not to
accept them, for such students usually have been highly
recommended, as to know how fast they [...] expected to adjust
academically and what kind of ini[...] transfer students offer
preparation [...]ting, do a better job [...] evaluate the preparation
[...] background that we did [...] student who completed
a secondary school training program in a specialized school for the
blind, and several young students from other countries who had 
had only one year of high school in the United States. 

It cannot be overemphasized that we do not formulate decisions 
completely on test results in such cases, but they do help im- 
mensely. In all these instances, we use, whenever we can, examina- 
tions which we have already used for regular purposes and for 
which we have developed norms; but they may not always be suit- 
able, so we sometimes turn to other measures. We keep a fairly 
large stock of sample copies of all the Cooperative tests and Form 
B, USAFI subject examinations. Of course, we also consider care- 
fully any previous test records that exist for the student, writing 
for them if the student tells us that she has taken tests. 


INDIVIDUAL GUIDANCE USES 


Scarcely a day passes when the test records on file in our office 
are not utilized for at least several students. At certain times (when 
students are declaring majors, when course grades are due, and so 
on) the incidence is higher. Although reports are sent to all faculty 
members so that they may make their own general interpretations 
of the test scores, many prefer to verify their conclusions at the 
office. Even the more experienced staff counselors do so. The types 
of questions that come up are of considerable variety; generally, 
they are of some such nature as these: 

Joan is not participating in class discussion; nothing seems to interest
her. I feel that I may be getting nowhere with her. Is the trouble with
me or with the student?

Nancy wants to reduce her seventeen-hour program to fourteen hours
because she says she can't keep up with her assignments. Should I
encourage her to do so?

Ruth’s family has had financial reverses and she may have to with-
draw from college unless we offer scholarship aid. How much of our
emergency funds can we justify for this student in terms of what we
have done for others of the same potential?

Loretta has a sinus condition which is causing frequent absences.
Her family is trying to decide whether she should withdraw from col-
lege or whether she will be able to continue with frequent absences.
Does it seem that she could carry on independently?

Betty wants to work six hours a week in the dining room. She is
already working Saturdays off-campus. We don’t think she needs the
money, but we need the service. Should she be encouraged to do this?

Judy wants to study French privately this summer and try to exempt
her language requirement in the fall. Is this advisable?

Susan wants to enter a graduate school which requires the Graduate
Record Area Examinations. Is she likely to do well in those examina-
tions?

Joanne is making the lowest grade in the class. Can she be expected
to do better?

Catherine is being considered as an editor of the yearbook. Should
she be encouraged to seek this appointment, which is very time-consum-
ing?

Dorothy has been good in my classes except that she cannot meet my
standards of English proficiency in her written work. Why? Doesn't
she know better, or is this a case of carelessness?

Gladys is being considered for a job by a local industry. Can you
provide any statement of her test records that may be helpful to her
in securing the job to send along with her other records?


EDUCATIONAL RESEARCH STUDIES 

Test records of students are kept readily available so that prob-
lems that arise in relation to the program in [...] certain
admissions policies operate to improve the quality of candid[...]
education program as able as other [...]ps? Are grading
standards of the college, its departments, and its individual in-
structors reasonably fair? What caliber of students transfer? How
well does our education curriculum, developed after [...] con-
troversy, prepare students? How may a class be divided into units
of fairly equal caliber for experimental instructional purposes?
How do students of different caliber fare under different instruc-
tional organization or methods? Which high schools may regularly
be depended upon to refer acceptable applicants to us; which, in-
ferior applicants? (The latter obviously do not understand our
standards and need enlightenment.) Is student performance in the 
general education program the equivalent of normal expectation 
in other liberal arts colleges? Are there unequal achievements by 
areas to be resolved? Does work in certain courses stimulate stu- 
dents to keep abreast of current affairs? Who are the under- 
achievers among students? Does the relatively heavy emphasis in 
general education affect achievement adversely in the field of the 
major? Or does it support the major? 

The above questions have actually arisen in the past year or two. 
Each year will bring new ones. Most of these are reported on di- 
rectly to the originating group, but all are briefly reported to all 
faculty in a bulletin issued periodically by the office. Faculty also 
receive lists of the results of any tests that may be given to students, 
along with appropriate instruction for interpretation. A Faculty 
Handbook also carries a dozen pages or more describing the gen- 
eral functions and services of the office, chiefly for the benefit of 
new faculty. Not all of these services are test-centered and hence 
not all have been described here. 


To summarize: 


1. Tests are useful and important in conscientious educational
planning;

2. Tests must be taken with a grain of salt—no test is infallible;

3. The administration of many tests and the processing of much
test data are meaningless and expensive routines unless the re-
sults are put to work for the individual and for the total pro-
gram;

4. Test interpretations cannot exist in a vacuum, but must be
utilized differently and varied with the kind of decision that
is pending;

5. A testing program is not the prerogative of the larger institu-
tion, but is just as much needed, and eminently more flexible,
in the smaller institution.


9. The Testing Program of the College of 
the University of Chicago 


CHRISTINE McGUIRE, Examiner in the Social Sciences 


EACH YEAR THE COLLEGE OF THE UNIVERSITY OF CHICAGO
enrolls some five hundred entering students in a variety of
four-year programs leading to the A.B. or B.S. degree. The student's
program is, of course, dependent on his academic
and professional interests. But all programs are alike in that each
consists of two major components: general education and
specialized education. The length and content of the specialized
component depend on the demands of the student's field [...]. In
general, these requirements are determined by the appropriate
academic department. However, it is important to
note that though the purpose of the general education component
is the same for all students, namely, “to give a common, critical
understanding of the major fields of human knowledge and their
interrelationships,” both the length and content of that particular
component [...] may also vary, depending on: (1)
their previous education and (2) their chosen fields of
specialization.

[...] develop certain [...] major fields of
knowledge. These revisions in the curriculum were accompanied
by sweeping changes in traditional educational practices. These 
latter were designed to implement a plan of giving greater inde- 
pendence to students of exceptional ability by freeing them from 
all requirements of course credit, class attendance, daily and 
weekly assignments, and the like. All students were encouraged, 
but were not required, to attend lectures and class discussions, to 
submit for criticism written or laboratory work, or to study more 
or less independently. By the same action the instructional staff 
was relieved of the onerous roles of disciplinarian and judge and 
was freed to devote more effort and attention to its teaching duties. 

It soon became clear that as a by-product of this release of both 
students and teachers for the pursuit of more appropriate educa- 
tional purposes, grades and teachers’ opinions of students could no 
longer be used to enforce attendance or coerce particular work and 
study habits. It was hoped that, as a result, students would be
moved by more appropriate motives in selecting the particular 
educational experiences in which they elected to participate. For 
the realization of these aspirations, it was felt essential to substi- 
tute a system of comprehensive examinations covering a year’s 
work in each broad field for the usual devices of course grades and 
credits.

An Office of the University Examiner, independent of the in- 
structional staff, was therefore established. It was the function of 
this office to plan, construct, and administer¹ the various comprehensive
examinations required in the general studies program, and
on invitation to cooperate with other faculties in the university 
who desired its services on particular examination or evaluation 
programs. A board of examiners composed of representatives from 
the several faculties was established to determine the general poli- 
cies of the Office of the University Examiner. Responsibility for 
specific policy with respect to particular comprehensive examina- 
tions was divided between the instructional staff in the relevant 
general courses and the examiner. The instructional staff was to be 
solely responsible for determining the objectives of each general 
course, and the examiner was charged with the technical job of
designing test exercises which would measure student achievement
of these objectives, of grading the examinations, and of certifying
the results to the registrar.²

¹ [...] responsibility for the actual administration of all examinations
has been placed in a separate Office of Test Administration.

The theory of the testing program at the University of Chicago
as it has developed from this early beginning can be briefly
summarized as follows:

First, the program assumes that the evaluation of academic
achievement of students is most reliable and valid if based directly
on students' performance on tasks which demand that they
demonstrate the knowledge, skills, and ability which are objectives of the
curriculum.

Second, these evaluations are regarded as significantly more
valid and reliable if they are made in such a way as to be
independent of [...] considerations as the length of time he has served
or his diligence in performing assigned work.

Third, it [...] information.

Fourth, it is felt that the more important objectives can be
measured only if the student is required to apply his knowledge and
skills to new problems.

Fifth, it is considered of basic importance that all testing and
evaluation should be closely related to teaching and curricular
problems and should be utilized to guide both students and
instructors in the planning of their educational experience.


KINDS OF TESTS USED 
On the basis of these [...] of general education [...] toward
the objectives of the program, and final comprehensive examinations
of achievement in particular parts of the general education
curriculum. The tests themselves, their major purposes, and
the uses made of their results are described below.

² Since the initial preparation of this manuscript, the administrative
[...] which the examiners in [...] examiners or the general
practices which will be described.


Entrance examinations 


Applicants for admission to the University of Chicago are re- 
quired to take an entrance or scholarship examination which is 
designed to provide evidence about the candidate’s ability to do 
academic work at the level required in the College of the Univer- 
sity of Chicago or in the division to which the student is applying. 
At present, this entrance and scholarship battery consists of the 
Scholastic Aptitude Test of the College Entrance Examination 
Board,³ a test of reading comprehension, and a test of skill in
quantitative reasoning. All tests except the first are composed 
locally by the Office of the University Examiner. 

All entrance tests are designed to help answer the question: 
Does this particular student have a reasonable chance of succeed- 
ing in the University of Chicago program considering the expected 
level and pace of work and the nature and conditions of instruc- 
tion offered? In the case of the scholarship applicant, the question 
becomes: Does this particular student have a reasonable prospect 
of doing more than satisfactory, perhaps even distinguished, work 
in this type of program? Hence, the tests are designed to give 
evidence about the specific abilities needed to meet the demands 
of this particular program. 

To illustrate: The program stresses the importance of dealing 
directly with primary source materials. Students are normally ex- 
pected to use original sources rather than to rely on reading about 
such documents. As a result there is a substantial amount of read- 
ing in the undergraduate program which places a premium on 
the experience that students can use in its interpretation. In the 
classroom, there is almost no lecturing or quizzing about this read- 
ing. Rather it is expected that students will be able to participate 
in a group attack on a common problem and to discuss intelli- 
gently the nature and ideas of the various documents considered. 


It is important, therefore, that the reading comprehension test be
one on the basis of which we can predict whether or not students
will be able to share in and profit from such discussions.
Consequently, the reading test is neither a measure of reading speed, nor
of simple understanding of the words or sentences, but is rather a
test of the ability to deal with ideas presented in short passages
representative of various styles and subjects. Other components of the
entrance test battery are designed on the basis of similar types of
considerations.

³ Candidates from local high schools are permitted to substitute the American
Council Psychological Examination for a part of this battery.


One other set of issues has been [...] design of entrance tests.
Often colleges are criticized on the ground that their admission
[...] the appropriate [...] without penalizing those
whose high school experience is unconventional and without
imposing a pattern of rigid requirements that high schools feel
obligated to meet. [...] relevant policy groups to be sure that [...]
which actually ought to be important
in determining admission to the university.

Test results for each applicant, together with interpretive data,
are made available to the Office of Admissions and to student
advisers by the Examiner's Office. The latter has no power to make a
decision about an applicant but, in reporting the level and pattern
of scores for each prospective student, identifies those suggestive
of probable failure in the program or those which suggest the
desirability of various types of remedial action, prior to or
concurrent with the student's first year in the program. For instance,
in borderline cases remedial reading and/or writing programs may 
be recommended for some students. Cases requiring other special 
action are often drawn to the attention of the Admissions Office or
adviser’s office. It should perhaps be stressed that no single cri- 
terion is by itself critical in determining an applicant’s admissi- 
bility to the university. Rather the constellation of a student’s 
scores is considered, together with information from interviews 
and other data about him. These data as now employed assist the 
Office of Admissions in selecting students who are likely to com- 
plete successfully the requirements of the particular parts of the 
university to which they are applying. 


Placement tests 

It should be obvious from the above discussion that the en- 
trance test results are widely used by advisers and are available 
to faculty for the purpose of advising the student about his pro- 
gram, but never for the purpose of determining his requirements 
once he has been admitted. The tests which we are about to de- 
scribe can be said to have almost the reverse function: they are 
used primarily to determine the student’s general education re- 
quirements and only secondarily to advise him regarding the best 
means of meeting those requirements. 

It may be necessary at this point to describe briefly the general 
education component of the undergraduate curriculum in order to 
make clear the way in which the placement tests function. The total 
offering of general education from which a student’s requirements 
are selected consists of three three-year sequences: one each in
social sciences, humanities, and natural sciences; a one-year
sequence each in mathematics, English, and foreign language; and
two additional one-year sequences, one organized on historical 
principles and the other on philosophical principles, designed to 
assist the student in achieving an integrated view of his educational 
experiences. As stated above, the student's general education
requirements are determined from among these fourteen general
courses on the basis of his previous academic achievement and his
plans for future specialization. It is in the determination of his
previous academic achievement that the placement testing
program at the university is relevant. A student is not able to satisfy
requirements in the general studies program by presenting credits
for satisfactory work done in related courses in high school or
other colleges. Rather he must demonstrate that he [...]

[...] course. At the same time, the test items must be independent of
specific reading lists and of specialized terminology which is not 
in itself essential to the objectives of the course. The tests must be 
independent even of those concepts and approaches to a field 
which may be peculiar to a particular course, faculty, or series 
of readings and, hence, parochial in the sense that they are not 
generally familiar to people who are well informed and competent 
in the field. 

As is apparent from the above discussion, the knowledge, skills, 
and abilities which should be sampled in any placement test are 
determined by the objectives of the relevant course. The College 
faculty generally, and specifically the staff responsible for a par- 
ticular general course, determines the objectives which constitute 
the basic specifications of a placement test. Ordinarily these spe- 
cifications are so formulated that they identify both the skills a stu- 
dent is expected to develop and the content or problem areas to 
which he should be able to apply these skills. It is then the duty of 
the Office of the University Examiner to design tests which are 
valid in measuring student achievement in the designated respects 
and which are at the same time appropriate for students who 
represent a wide variety of previous educational experience. 

In principle, the system described here implies a sharp separation
of responsibilities: the policy decisions to be made by the relevant
faculties or course staffs and the technical implementation
of those policies to be accomplished by the Examiner's Office. In
practice, however, these administratively independent groups
cooperate fully both in the determination of objectives and in their
implementation in the testing program.⁴ The examiner in a given
subject is also engaged in some teaching duties on the staff for
which he is examiner and both formally and informally works
closely with that staff. He often finds it necessary, for instance, to
stimulate the staff to revise or reformulate its objectives so as to
make them sufficiently specific for test and evaluation purposes. He
frequently finds that in faculty discussion of specific test materials
objectives are clarified and, at the same time, new devices for
measuring them are suggested. Similarly, staff criticism of test materials
is considered in revising and editing, not only specific items, but
also, on occasion, the general nature of an entire test.
Alternatively, data reported to the relevant staffs concerning initial
student competence and deficiencies are utilized in the revision of
course materials. Finally, the standards for excusing students from
a particular general course are worked out jointly by the
Examiner's Office and the appropriate teaching staff.

⁴ [...] of the instructional and examining functions was recognized
explicitly in the administrative reorganization referred to above.

Since, as indicated above, the primary function of the placement 
test is to determine whether or not a student has already achieved 
the objectives of a given general education course, it may be inter- 
esting to note how the data about student performance are used in 
determining each student's required program of general studies. 
Before a placement test is given to entering students, the items are [...]

[...] less than the normal time or to take more than the normal load [...]

[...] placement test scores together with the above evaluations and
other data helpful in interpreting the scores are reported to the 
student's adviser, who uses them in combination with a considera- 
tion of the student’s proposed field of specialization, as a basis for 
determining the student's required program of general studies and 
for advising him informally about the combined work-and-study 
load which he should undertake. The data are also available to any 
College staff or individual faculty member who may wish to know 
more about the composition of his current class or who may find 
such data helpful in informal conferences with students. 

This program has had a three-fold result: (1) By placing stu- 
dents in courses that they need and are prepared to take and by en- 
couraging them to progress as rapidly as possible, repetition of 
learning experiences has been minimized and student and faculty 
time is thereby economized. (2) By indicating what courses a stu- 
dent is prepared to study or what remedial work he needs, failure 
among students has been reduced. (3) By supplying the teaching 
staff with evidence about initial student status, it has been possible 
to redesign courses so that they are appropriate for the actual 
student body.


Advisory tests of student achievement 


Once or twice during the first two quarters of a three-quarter 
course, students are given the opportunity to take a number of test
exercises under regular examination conditions. These exercises 
sample the work of the course to date and give the student an op- 
portunity to demonstrate the extent to which he is acquiring the 
knowledge and skills regarded by the staff as important objectives 
of the course. The exercises are composed of questions which are 
similar to those the student will later encounter on the final com- 
prehensive examination and are designed to serve both teaching 
and testing purposes. They are regarded as important educational 
experiences in that they, like the class discussion, help to orient
the student and to inform him about the specific aims to which the
course is dedicated; they presumably present him with challenging
problems which encourage him to apply to new situations the
knowledge and skills which he has been developing in the course;
and by giving him experience with such exercises in an examination
situation, it is hoped that tension and other obstacles arising
from inexperience will be minimized in the final comprehensive 
examination. The exercises serve an evaluation purpose both for 
the student and the staff of a course: papers are scored and re- 
turned to each student with an indication of the quality of per- 
formance represented by his score. However, grades on these tests 
are purely advisory, do not form a part of the student's permanent 
academic record, and cannot be used as a means of establishing 
credit to meet a general education requirement. The instructional 
staff receives reports of individual scores for use in advising stu- 
dents with respect to their work. In addition, group responses to 
each item in the exercises are reported to the teaching staff for 
use in assessing the effectiveness of the current selection of read- 
ings and discussions in meeting the objectives of the course. 
The instructional staff has always had the formal responsibility 
of planning, constructing, and administering these tests. However, 
since it is clearly recognized that to be of maximal value they 
should be based on the same principles as the comprehensive ex- 
amination (which is the responsibility of the Examiner’s Office), 
most staffs and examiners have followed the practice of joint 


planning and preparation of these test materials.


Comprehensive examinations of student achievement 


Repeated reference has been made to the comprehensive ex- 
amination. Initially the student’s general education requirements 
are formulated in terms of the kinds of competence he should 
attain rather than the number of courses he must complete. This [...]

[...] that would ordinarily prepare him for it, and he may repeat it
as frequently as he wishes. 

The system of comprehensive examinations is one of the oldest 
aspects of the testing program at the University of Chicago. When 
it was initiated, there was some concern about the ensuing division 
of responsibility between the faculty, which was expected to determine
the objectives of the general courses, and the newly created
Office of the University Examiner, which was to develop compre- 
hensive examinations appropriate to these objectives. In some 
quarters it was feared that an independent office of examinations 
would soon dictate the curriculum and restrict instructional work 
to cram sessions for the final examination or, alternatively, that 
instructors would develop such detailed specifications of objectives 
that the resulting examinations would be merely course examina- 
tions, in no sense comprehensive. Neither of these dangers has 
actually materialized. The close cooperation between each instruc- 
tional staff and the examiner in each field has obviated the first 
problem; at the same time, the existence of a group of individuals 
with common concerns in improving examinations and with the
time and resources for experimentation in this area has minimized
difficulties of the latter sort.

From the joint effort of the instructional and the examining 
staffs a number of new types of testing techniques have gradually 
been developed which have influenced both examining and in- 
structional practices. One of the most interesting is the extensive 
use of open-book examinations even with nonessay types of tests. 
Students are encouraged to bring any books and notes they may 
wish to use to the examination. The questions are designed to 
require the students to apply concepts and principles to new 
situations ranging from the analysis in the Humanities Compre- 
hensive of an unfamiliar sonata or painting to the interpretation 
in the Natural Sciences Comprehensive of a scientific paper or 
report of an experiment new to the students and to the evalua- 
tion in the Social Sciences Comprehensive of specific governmental 
policies not discussed in the relevant course. This particular technique
has required the Examiner's Office to develop analytical
rather than informational types of objective questions and has 
freed instructional staffs from the necessity of covering a detailed
body of subject matter and encouraged them to develop alternative
modes of accomplishing the same broad objectives.

By this time it is probably clear that the comprehensives differ 
from the tests described in previous sections in a number of im- 
portant respects. First, they differ from the entrance and scholar- 
ship tests in attempting to measure achievement in particular 
subjects rather than general scholastic aptitude. They differ from 
the placement tests in that all students who take them can be 
presumed to have at least one set of common educational experi- 
ences (that is, a common reading list); hence questions requiring 
a high level of analysis can be based on particular readings with- 
out fear that unfamiliarity with that specific book or author will 
constitute an unreasonable obstacle. They differ from the advisory 
tests, first, in that performance on them does form a part of the 
student’s permanent record, and hence his interest in them may 
differ; and second, they cover such substantial areas of subject 
matter that it is feasible to require the student to demonstrate 
a level of analysis and integration that would be quite impossible 
in the quarterly advisory tests. 

The characteristics of the comprehensives are reflected in the 
nature of the results obtained from them and the uses to which [...]

[...] were initially prepared only for students who made low grades
on an examination, with the thought that such an analysis would 
be useful to anyone who needed or wished to repeat an examina- 
tion. However, students whose work was clearly satisfactory began 
to request these reports so frequently, that they are now made to 
every student as a matter of routine. 

Reports to the faculty of the results of comprehensive examinations
have differed over the years and currently vary among the
several examiners. The instructional staff has access to any of the 
data on comprehensive examinations which are obtained in the 
routine process of scoring papers and reporting student grades. 
In addition, the Examiner’s Office supplies them with any special 
data they may request or find useful in curriculum consideration 
and revision. As a general rule, such data include a summary of 
group responses to each question in the examination, together 
with more or less extensive interpretations of and hypotheses about 
the results. These data and their interpretations are used in vary- 
ing degrees by the several instructional staffs in considering
curriculum revisions.
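
As a rough illustration of what a summary of group responses to each question might contain, the sketch below tallies, for every item, the share of students choosing each option and the share answering correctly. The answer key, the response data, and the function name `summarize_item_responses` are hypothetical; they are not taken from the reports of the Examiner's Office.

```python
from collections import Counter


def summarize_item_responses(responses, key):
    """Summarize group responses to each item.

    responses: list of per-student answer lists (hypothetical data)
    key: list of correct options, one per item (hypothetical)
    Returns one dict per item with option shares and proportion correct.
    """
    n_students = len(responses)
    summary = []
    for item_index, correct in enumerate(key):
        # Count how many students chose each option on this item.
        counts = Counter(student[item_index] for student in responses)
        shares = {option: count / n_students for option, count in counts.items()}
        summary.append({
            "item": item_index + 1,
            "option_shares": shares,
            "proportion_correct": counts[correct] / n_students,
        })
    return summary


# Four hypothetical students answering three multiple-choice questions.
responses = [
    ["A", "C", "B"],
    ["A", "B", "B"],
    ["D", "C", "B"],
    ["A", "C", "D"],
]
key = ["A", "C", "B"]

for row in summarize_item_responses(responses, key):
    print(row["item"], row["proportion_correct"], row["option_shares"])
```

A table of this kind only describes the group's answers; the interpretations and hypotheses that accompany it remain a matter of judgment for the examiner and the staff.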

The Examiner's Office has been continually concerned with 
devising improved means of reporting examination data to maxi- 
mize their usefulness to the faculty in evaluating the efficacy of 
alternative teaching materials and procedures. For instance, on 
certain occasions reports have been made comparing student per- 
formance on questions in placement tests with their performance 
on identical or parallel questions in comprehensives. Such data 
have been used as one source of evidence about the nature and 
amount of improvement in student skills and knowledge resulting 
from the related general course. On other occasions data about 
performance on different types of questions or in different content 
areas have been used as one source of information about the need 
for reorganization of course materials, or for strengthening certain 
aspects of the general education program. On still other occasions
the performance of students in different types of programs has been
compared. This method has been used in connection with
consideration of variant forms of the general courses and in
evaluating the consequences of alternative placement procedures.
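
The comparison of placement and comprehensive performance on identical or parallel questions can be sketched as a simple per-item difference in the proportion of students answering correctly. The item labels, the proportions, and the function name `item_gains` below are hypothetical and serve only to show the arithmetic.

```python
def item_gains(placement_correct, comprehensive_correct):
    """Per-item change in proportion correct between two administrations.

    Both arguments map an item label to the proportion of students
    answering that item correctly (hypothetical figures). Only items
    that appear in both tests are compared.
    """
    shared = sorted(set(placement_correct) & set(comprehensive_correct))
    return {
        item: round(comprehensive_correct[item] - placement_correct[item], 3)
        for item in shared
    }


# Hypothetical proportions correct on the placement test and on the
# comprehensive examination; Q4 appears only in the comprehensive.
placement = {"Q1": 0.40, "Q2": 0.55, "Q3": 0.70}
comprehensive = {"Q1": 0.75, "Q2": 0.60, "Q3": 0.68, "Q4": 0.90}

gains = item_gains(placement, comprehensive)
for item, gain in gains.items():
    print(f"{item}: {gain:+.2f}")
```

As the text notes, such differences are one source of evidence about improvement, not a complete account of it; the two groups of examinees and the testing conditions may differ.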
OTHER ASPECTS OF THE TESTING PROGRAM AT THE
UNIVERSITY OF CHICAGO 

The testing program described above is employed in the gen- 
eral studies part of the College curriculum. However, several of 
the divisions and professional schools also employ certain of these 
techniques with their students. Most departments and schools rely 
heavily on data from entrance tests in judging student applications 
for admission. In addition, since the university requires that a 
student have a broad general education before undertaking special- 
ized work in the divisions or professional schools, tests of general 
education (a variant of the placement tests described above) are 
used in the divisions and professional schools to identify the stu- 
dents who have acquired a general education and to indicate the 
fields in which other students are deficient. A few departments 
have used comprehensive examinations in basic skill courses in the 
field of specialization. Many departments require candidates for 
advanced degrees to demonstrate a reading knowledge of one or 
more foreign languages; the competence is certified on the basis 
of tests offered by the Examiner's Office. Also that office has
worked with many departments on special evaluation problems. 

In addition, the Examiner's Office has frequently been invited 
to cooperate in broad evaluation or follow-up studies of graduates 
of the College. In these studies, it has been recognized that the 
effectiveness of the College can and should be measured in terms 
of many criteria other than the level of knowledge and skills it 
would be appropriate to test for in a comprehensive examination. 
Hence, some of these studies have focused on the long-term bene- 
fits of the College program, others on the attitudes and habits of 
College graduates, and still others on their opinions about many 
aspects of College and university life, both curricular and extra- 
curricular. 

Finally, both the Examiner’s Office as a whole and individual ex- 
aminers have participated in a wide variety of research and testing 
programs other than those described above. These have included 
curriculum and testing programs of other schools and universities 
and of various professional groups and research programs, including
personality and assessment studies as well as those more directly
related to the evaluation of patterns of intellectual and
academic achievement. 

The listing of the various activities carried on by the Examiner's 
Office suggests the variety of personnel currently included on the 
staff. When the Examiner's Office was first established, it was staffed 
primarily with professional psychologists who had specialized in 
the field of tests and measurement. The early work of the office 
reflected their primary concern with the development of improved 
techniques of measurement and the statistical evaluation of these 
new modes of testing. Many of the early research studies made
substantial contributions to statistical theory and its application 
to test and measurement problems. More recently, the staff of the 
Examiner’s Office has included a larger proportion of people whose 
initial specialization is in the subject in which they serve as ex- 
aminers. As a result, both the research and writing done by mem- 
bers of the Examiner's Office now reflect their basic concern with 
scholarly problems in the field of their academic interest, issues of 
teaching and curriculum construction, as well as evaluation, and 
finally the interrelation of these various problems. 

The development of the extensive testing and research program 
now carried on by the Examiner's Office was made possible by its 
establishment as an organization which, though cooperating
closely with the instructional staff, had an identity and functions 
independent of the day-to-day problems of that staff. At the same 
time, members of the examiner's staff have found their work 
greatly facilitated by their association with and participation in 
college teaching. Similarly, the university testing program has it- 
self been enriched by the challenging and stimulating association 
that members of the Examiner's Office have enjoyed in the many 
testing, evaluation, and research projects in which they participate 


outside the university. 


10. The Testing Program of 
Dartmouth College 


CLARK W. HORTON, Consultant in Educational Research 


DARTMOUTH COLLEGE IS A PRIVATE FOUR-YEAR LIBERAL ARTS COL- 
lege for men with a freshman class of about 725 and an under- 
graduate enrollment of about 2,800. It draws its students from all 
over the United States with heaviest representation from the 
Middle West, Middle Atlantic states, and New England. About 
two-thirds of its students are from public high schools, one-third 
from independent schools. Over 80 percent come from the top 
quarter of their classes. Attrition in the first year is less than 4 percent;
attrition before graduation less than 20 percent. The testing
program at Dartmouth has developed in response to needs for 
information useful in admission, course placement and proficiency 


exemption, guidance, and the evaluation of achievement in courses 
of study. 


ADMISSIONS TESTING 
Beginning with the class of 1956 all applicants have been required
to submit scores on the College Entrance Examination
Board Scholastic Aptitude Test. [...]

[...] process the College Board scores play an important but not a
determining role. 

Some years ago the Educational Testing Service studied the 
value of important items of our preadmission data for predicting 
the first-year grade-point average and developed an index in which 
optimum weight is assigned to the SAT-Verbal score, the SAT- 
Mathematical score, rank in high school class, and the principal’s 
recommendation as adjusted by the Admissions Office. This predictive
index is computed for each applicant, and plays an important
though not necessarily a determining role in admission decisions.
Continuing studies of the validity of this index yield correlations
of over .60 with the first-year and the two-year grade-
point averages and of about .55 with the four-year average. The 
results of such studies are also expressed as experience tables, which 
serve as expectancy tables in guidance as well as in admissions. 
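
The index and the experience tables described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights, the field names, the band width, and the grade threshold are all hypothetical, since the text does not report the values used at Dartmouth.

```python
import math
from collections import defaultdict

# Hypothetical weights for illustration only; the text does not report
# the values derived in the Educational Testing Service study.
WEIGHTS = {
    "sat_verbal": 0.002,
    "sat_math": 0.001,
    "class_rank_percentile": 0.01,
    "recommendation": 0.2,
}


def predictive_index(applicant, weights=WEIGHTS):
    """Weighted sum of the four preadmission measures named in the text."""
    return sum(weight * applicant[name] for name, weight in weights.items())


def expectancy_table(records, band_width=0.5, threshold=2.0):
    """Proportion of past students in each index band whose first-year
    grade-point average reached `threshold`.

    records: iterable of (index, first_year_gpa) pairs.
    Returns {lower edge of band: proportion at or above threshold}.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [students, successes]
    for index, gpa in records:
        band = math.floor(index / band_width) * band_width
        counts[band][0] += 1
        if gpa >= threshold:
            counts[band][1] += 1
    return {band: hits / n for band, (n, hits) in sorted(counts.items())}
```

Read this way, an expectancy table tells a counselor or admissions officer what share of earlier students with a similar index went on to reach a given average, which is easier to explain to a layman than a correlation coefficient.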


FRESHMAN ORIENTATION 

During Freshman Week entering students take a large bat- 
tery of tests. In the fall of 1957 all men took Reading Compre- 
hension, Test C2, and the Mechanics of Expression Test of the 
Cooperative English Test; the Strong Vocational Interest Blank; 
the Minnesota Multiphasic Personality Inventory; and a form 
prepared by the Office of Student Counseling to explore aspects 
of the student’s personal history and his educational and voca- 
tional plans. Appropriate groups take placement and proficiency 
tests in a variety of subjects. All men enrolling for a foreign lan- 
guage course above the beginning level take a College Board Ad- 
vanced Placement Test in that language. Selected groups take 
tests in American history, European history, mathematics, chemis- 
try, physics, biology, and English. In some subjects, CEEB Ad- 
vanced Placement Tests or other published tests are used; in 
others, the tests are objective tests developed by the departments.
The latter frequently are tests that have been used as part of the
final examination in the beginning course, sometimes over several
years, and on which the scores made by large populations completing
the course serve as normative data for the appropriate course
placement or proficiency exemption of entering students. Under
the auspices of the Thayer School of Engineering, one of the 
Dartmouth Associated Schools, interested men take the Educa-
tional Testing Service Pre-Engineering Ability Test, and the re- 
sults are used in counseling. Candidates for the Army, Navy, and 
Air Force ROTC units take qualifying tests administered by those 
units. All men have a thorough physical examination at entrance, 
and all men not medically excused take four physical ability tests 
developed by our Department of Physical Education. 

The scores on all tests of general interest are put on punch 
cards, and converted to standard scores or percentiles in the tabu- 
lating center. Percentiles are based on the class itself, but tables 
of percentile norms on the population entering over the past sev- 
eral years are available for reference. Alphabetical lists of raw 
scores, standard scores, and percentiles are prepared in duplicate 
and placed in the hands of the Office of Student Counseling and 
the deans, together with appropriate profile sheets, tables of norms, 
and other aids in interpretation. Although there is a system of 
faculty advisers to foster closer student-faculty relations and to aid 
in academic guidance, the scores are not systematically made
available to all advisers. Some who are interested and competent
to interpret the scores get copies of the score reports at their
request. A current experiment in which six such advisers are lead- 
ing discussion groups in connection with a required freshman 
course, The Individual and the College, offers promise of increased 
interest and competence in the use of test data by faculty advisers. 
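
The conversion of raw scores to standard scores and percentiles based on the class itself can be illustrated as follows. This is a minimal sketch, not the tabulating center's procedure: the standard-score scale (mean 50, standard deviation 10) and the mid-rank definition of a percentile are assumptions, since the text specifies neither.

```python
from statistics import mean, pstdev


def standard_scores(raw, center=50.0, spread=10.0):
    """Rescale raw scores so the class has mean `center` and standard
    deviation `spread` (the 50/10 scale is an assumption)."""
    m, s = mean(raw), pstdev(raw)
    if s == 0:
        # Every student made the same score; all sit at the center.
        return [center for _ in raw]
    return [center + spread * (x - m) / s for x in raw]


def percentile_ranks(raw):
    """Percentile rank of each score within the class itself, counting
    half of the tied scores as falling below (mid-rank convention)."""
    n = len(raw)
    ranks = []
    for x in raw:
        below = sum(1 for y in raw if y < x)
        tied = sum(1 for y in raw if y == x)
        ranks.append(100.0 * (below + 0.5 * tied) / n)
    return ranks
```

Percentiles computed this way describe a man's standing among his own classmates; comparison with the norms for earlier entering classes would require the same calculation against the pooled scores of those classes.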

Scores on the C2 Reading Comprehension Test serve not only as
supplementary evidence of academic potential and for diagnosis in
counseling, but are also use 


to decide about others who k 
are reported, and these data are used in the Office of Student
Counseling in conferences on the student's vocational plans and 
course choices. Several hundred freshmen seek such conferences. 
On request of the counselor the original paper may be scored for 
additional occupations. Some one hundred men retake the test in 
the junior or senior year, and their papers are scored for all 
occupations specified by the counselor. The Tuck School of Business
Administration, a two-year Associated School entered by some
seventy-five Dartmouth students at the end of the junior year, 
administers the Strong test to its second-year men. The Minnesota 
Multiphasic Personality Inventory was used with the entire fresh- 
man class for the first time in the fall of 1957, and its value to us 
is under investigation. 

Many tests are used with individual students or small groups in 
the work of the Office of Student Counseling, the tests used being 
determined by the nature of the problem. These tests include 
many titles in the categories: general intelligence or scholastic 
aptitude; achievement in academic subjects; tests of more specific 
aptitudes; tests of interest and personality. Other tests are used for 
experimental, demonstration, or guidance purposes in the classes 
of the Department of Psychology; for example, all students in Psychology 1,
some 90 percent of each class, take the Allport-Vernon-Lindzey
Study of Values.


PLACEMENT EXAMINATIONS 

The most extensive, long-standing use of tests for placement 
and proficiency exemption is in the foreign languages. The Dartmouth
requirement in foreign language is stated as “the ability to
read with understanding a representative passage in a foreign language,”
and this is further defined as attainment of a score of 600
or higher on a College Board Achievement Test, or a passing grade
in specified courses, normally the fourth-semester course. It is
strongly recommended that students continue the language taken
in high school, although exceptions are made. All students who
have had instruction in the language for which they enroll must
either present a score on the appropriate College Board Achievement
Test or take a College Board Advanced Placement Test at
entrance. On the basis of such a score they are either certified as
having met the requirement or assigned to a course consistent with
their achievement. Placement ranges from assignment to a course
in which a man can complete the requirement in one semester to
demotion to the beginning course. The number of years of study
of a language in high school is not an infallible index of competence
in it.

The College Board Advanced Placement Tests are used as one
hour of the final examination in courses above the beginning level,
and outstanding students in the beginning course are invited
to take them. On the basis of such scores and other evidence the
student may complete the requirement, or be given a jump promotion
at the end of any semester, or proceed normally through
the sequence of courses. The College Board tests are used also as
part of the validating examination required of students who seek
credit for summer school language courses. Data accumulated
over the years by this end-of-course testing, and from studies of
scores in relation to course grades, have been used to establish
the cutting scores that govern placement. A high proportion of
the men who satisfy the requirement by test at entrance elect to
continue language study for one or two semesters. About half of
the freshmen complete the requirement by the end of two semesters;
some who fail courses require more than four semesters.

A similar pattern obtains with respect to the requirement in 
physical education. All men not medically excused take a series of 
four physical ability tests devised by the department. Performance 
on each, and on the total series, is reported in terms of standard 
scores established over the years on the Dartmouth population.
Total score is used to classify men into groups A, B, and C. Men 
who achieve and maintain an A classification are exempt from 
classes; men in group B have a choice of courses; men in group 
C are limited in choice, or may be assigned to special remedial 
classes. The scores on the specific tests are used diagnostically in
this way. The tests are repeated at the end of each semester, and 
a man may attain a higher group and, hence, exemption or in- 
creased freedom at such time. Men who attain B at the end of 
the third semester have satisfied the requirement, but, failing 
that, are required to complete four semesters of classes. 

EXEMPTION EXAMINATIONS

The establishment of a standing Committee on Proficiency and 
Placement has given impetus to the program of placement and 
proficiency exemption in academic subjects. Cognizant of the 
enormous range of ability and achievement in particular subjects 
of study present in any entering class, this committee has sought 
to identify men of outstanding competence; to exempt them from 
part of the distributive requirement in the areas of their compe- 
tence; to effect their placement in advanced courses or in honors 
sections; and in a few cases to grant college credit and resultant 
advanced standing. The motivation for this work has been the 
desire to remove exceptionally competent students from the stulti- 
fying experience of repeating course work and to place them in a 
challenging situation where they can progress more rapidly in 
accordance with their better preparation and greater ability. 

During the summer the dean of freshmen examines the second- 
ary school records of all freshmen, selects some 250 men with out- 
standing records in subjects of study, and writes to them suggest- 
ing that they take proficiency tests at entrance. Most of these men, 
and some additional candidates, do take the tests and some 175 
proficiency exemptions are granted. Such exemption gives freedom 
from part of the distributive requirement, but not college credit 
nor necessarily advanced course placement; and its significance 
varies with the subject. For example, a man pursuing a preprofes- 
sional program in a science may still be required to take the basic 
course in physics or chemistry, even though he demonstrated the 
proficiency required for freedom from part of the distributive 
requirement in science. On the other hand, he may be relieved 
of part of the social science requirement by demonstrating pro- 
ficiency in history. Additional, more difficult tests are used to 
screen men for advanced course placement in some departments; 
in others, men are put in honors sections where the pace is faster.
In mathematics, for example, no proficiency exemptions are 
granted, but some seventy-five men selected through scores on the 
SAT-Mathematical Test and the College Board Advanced Mathe- 
matics Test are taught in special sections. Men granted exemption 
from English 1 are encouraged but not required to elect a special 
honors course. This is a developing program, which has been stimulated
by the work of the School and College Study of Admission
with Advanced Standing and by the College Board Advanced
Placement Test program.


SERVICES TO THE FACULTY 


The Office of Educational Research at Dartmouth operates as 
a service office available to individual members of the faculty, 
departments of instruction, committees, and officers of adminis- 
tration. Its services include the typing, reproduction, and assembly 
of test papers; machine scoring and score reporting; test storage 
and control; studies of the validity and predictive value of test 
d, more generally, consultation 
arch projects. It was created to 
l; it attempts to do so, in part, 
nd other data-collecting devices 
te treatment and use of the data. 


educational experiments. 

culty is voluntary, with the sole 
copy and reproduces the papers 
n program. There is a standing 


resolution of conflicts, and action on 
lished regulations; but except i 
committee does not control the ki 


stance, not one of control.

The availability of this service has fostered the development of a
large program of objective course examinations, many of them of
high quality. The office helps, on request, by providing specimen
test forms, by advising on test construction and scoring, and sometimes
by critically reviewing the questions. All question-writing
is done by the course instructor or course staff. The answer sheets,
special pencils, scoring, and all other services are provided without
charge against course budgets. Normally, scores are reported within
a few hours after receipt of the papers. The test booklets and
answer sheets are serially numbered, carefully controlled, stored,
and, if the instructor wishes, may be used over a period of years. 
Such repeated use of good tests is encouraged because it permits 
comparison of classes from year to year and tends to stabilize grad- 
ing practice, and because the records of scores made by large 
populations at the end of a course permit such tests to be used 
effectively in the course placement or proficiency exemption of 
entering students. 

Objective tests prepared by teachers are used widely both as 
hour examinations during the semester and as final examinations. 
Some 40 percent of total course registrations are represented by a 
machine-scored paper in any final examination period. A system 
is followed of assigning each machine-scored test an accession 
number to facilitate filing and control; this number series is now 
over 1,400. Machine-scored tests are used more commonly in the 
large beginning courses than in advanced courses, and more com- 
monly in the sciences and social sciences than in the humanities. 
It is common practice in the social sciences and the humanities to 
devote one hour of the final examination to an objective test and 
one hour to the traditional essay test; there is a tendency to use 
all objective tests in the sciences. In all courses, however, the 
evidence derived from objective tests constitutes only part of the 
evidence used in evaluating student achievement and determin- 
ing course grade. 

The careful study of test questions is encouraged both as an aid 
in test improvement and an aid in instruction. The method of 
study varies with the case, but always includes a report of the 
percentage of students who gave the right answer to each ques- 
tion and some index of the question’s discriminative power. Ques- 
tions which prove too easy or too difficult, or which otherwise do 
not work well, are called to the author's attention for revision or 
replacement. As an aid in instruction, teachers for some courses 
request a study of the test questions immediately after scoring. 
The percentage of students who selected wrong answers to each 
question is quickly determined and reported to the course staff, 
who then devote the next class meeting to a discussion of those 
errors and misunderstandings most prevalent among their students.
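
The kind of question study described above can be sketched as follows. This is an illustration under assumptions, not the office's actual method: the text says only that "some index of the question's discriminative power" is reported, so the upper-group minus lower-group difference used here, the 27 percent group size, and all function and field names are hypothetical choices.

```python
from collections import Counter


def item_analysis(responses, answer_key, group_fraction=0.27):
    """Summarize each question on an objective test.

    responses: one list of chosen options per student.
    answer_key: the correct option for each question.
    Returns, per question, the percentage answering correctly, a
    discrimination index (proportion correct in the top-scoring group
    minus that in the bottom-scoring group), and the percentage of
    students choosing each wrong option.
    """
    n = len(responses)
    totals = [
        sum(choice == correct for choice, correct in zip(student, answer_key))
        for student in responses
    ]
    # Rank students by total score, highest first.
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    g = max(1, round(n * group_fraction))
    upper, lower = order[:g], order[-g:]

    report = []
    for q, correct in enumerate(answer_key):
        p_all = sum(student[q] == correct for student in responses) / n
        p_upper = sum(responses[i][q] == correct for i in upper) / g
        p_lower = sum(responses[i][q] == correct for i in lower) / g
        wrong = Counter(s[q] for s in responses if s[q] != correct)
        report.append({
            "percent_correct": 100 * p_all,
            "discrimination": p_upper - p_lower,
            "wrong_answer_percent": {o: 100 * c / n for o, c in wrong.items()},
        })
    return report
```

A question with a very high or very low percentage correct, or with a discrimination index near zero, would be a candidate for revision; the wrong-answer percentages are the figures a course staff could take into the next class meeting.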

At Dartmouth, as elsewhere, the quality of teacher-made tests
varies widely, depending on the quality of thinking and the 
amount of hard work that go into their preparation, and to some 
extent on the nature of the subject of study and the aptitude of 
the author for this work. Similarly, understanding of the meaning,
limitations, and legitimate uses of the scores from professionally
made,


present a difficult barrier to 
ional process. 

very college, no matter how 
n which the primary responsi- 
faculty and administration in 
ducational data. Whether it is 


is of little consequence. Precise 
institution, but the funda- 
ontributions tests can make 


and reporting, and the subsequent studi 
be made if the data are to be used 
ance with the latter does much to overcome the former, and
ultimately to gain for an institution the unquestionable benefits to
its educational program that derive from a wise use of tests.


11. The Testing Program of
the College of Arts and Sciences 
of the University of Louisville 


J. J. OPPENHEIMER, Dean of the College 


THE COLLEGE OF ARTS AND SCIENCES IS A PART OF THE UNIVERSITY
of Louisville, a municipal university. The college is fifty years 
old, although the medical unit of the university dates back to 1837. 
In the last twenty-five years, the enrollment of the college has 
fluctuated from 800 to a postwar peak of 2,800. Today's enrollment 
stands at 1,250. The college is coeducational, and, in 1951, the 
merger of the Municipal College for Negroes with the University 
of Louisville marked the beginning of integration in all colleges 
composing the university. Since about 90 percent of its students 
live in Louisville or Jefferson County, the College of Arts and 
Sciences is an urban, commuters’ college.

In 1928, upon the request of the Board of Trustees, Dr. F. J. 
Kelly, then dean of administration of the University of Minnesota, 
surveyed the educational needs of the college. Included in the
“Kelly Report” was the recommendation that a testing program
be established as part of the admissions program for freshmen. In 
1929 the American Council Psychological Examination was given 
for the first time and has been used annually since that date. In 
1932 the faculty set up a comprehensive plan of reorganization of 
the college and included in it a testing program to be used in the 
admission of all new students, student counseling, the granting 
of any advanced standing, admission to the senior college, and 
graduation. 

In a real sense the testing program of the college has followed 
the development of the general nation-wide programs for college 
testing, for basic policy has been to use only nationally devised
tests. As better national tests have been devised, the testing program
of this college has been improved. Although those administering
the program have felt that such policy is limiting in
scope, the college has been without sufficient financial resources 
to supplement the program with locally designed tests. As a practi- 
cal matter, however, it is felt that the use of national tests has 
proved far more adequate than any that like expenditures could 
have produced locally. In brief, it is believed that for a medium- 
sized college such as this one, nation-wide tests or programs are of 
great value. 

Upon recommendation of the faculty and the dean of the 
the Testing Service was established 
ees as a service agency in the office of 
e rank of assistant to the dean. Poli- 
tion of the office are determined by 
culty action, and administration of 
der the direction of the dean. Faculty 
vith the director and the dean, assist 


forms such functions as: scheduling 
g tests and recording results, notifying the 
of new students, notifying students of 
reting scores to them, and organizing 


S 1 ly reports of test results to faculty, deans, 
and the registrar. The Testing Service is equipped with an IBM


OBJECTIVES OF THE TESTING PROGRAM 


have always felt that the testing pro- 
ne educational process of the college. 
a par with library services, regis- 
- It has been used to provide a 


ing system and the caliber of graduates 
and for follow-up of graduates. It also represents an attempt to
examine, by the best means at hand, the academic and intellectual
quality of those entering the college, and to ascertain, midway 
and just before graduation, content achievement in the major 
areas of learning and in the major subject field. In a broader 
sense, it provides various kinds of measures of the prevailing cur- 
riculum with its elements of teaching and subject matter, as re- 
vealed in the learning of students. The testing program is far from 
a complete measure, and has never been accepted as such; but it 
has been one kind of measure of the academic accomplishments of 
both students and faculty. In this same respect, it has served as a 
control measure of curricular changes. 

One of the controversial elements in the 1932 reorganization 
plan was its program of general education, which at that time was 
looked upon as far more radical than it would be today. Faculty 
acceptance of the plan was contingent upon systematic evaluation 
of the general education courses by testing students in these gen- 
eral areas at the end of their sophomore year. The writer is sure 
that, if this condition had not been promised, and if students had 
not achieved standings comparable to the general national average
of like colleges, the success of the general education program
would have been greatly impaired and the program possibly abandoned.
Faculties are conservative when entering into any new educational
adventure, and more than “off-the-cuff” faculty opinion is
necessary to demonstrate the success of such a venture. Test results
furnish substantial bases for sound decision, raising many
serious questions in regard to teaching, course content, standards of
admission, graduation requirements, and many other academic
matters.

Seldom is a subject introduced in our general faculty meetings 
that some member does not utilize test results as part of his argu- 
mentation or suggest further use of this type of evidence. While it 
would be difficult to verify this point, the writer believes that a
testing program, in which nation-wide tests are used, gives the
faculty and administration a feeling of security, of self-respect,
and of belonging to a group of forward-looking colleges. This
should not be interpreted as meaning that the achievements are
always satisfying, that is, that they are as much as we should like
to see, but rather that they furnish objective guidelines for setting
future goals. Without such measures, faculty members would have
their own individual ways of determining the intellectual status
of the college, and often these estimates are entirely subjective
and illusory.


his deficiencies, especially in his general education. Also, for many 


ent for determining levels of attainment and, in
some instances, for granting advanced credit for demonstrated proficiency,


ical Examination, which yields three scores 
c, and a total score; (2) the Cooperative
test battery. Students who make a high enough score are excused
from all general education courses with the exception of English 
Composition. 

Since 1934 the college has admitted selected students who have 
completed three years of senior high school work with selection 
based on high marks in high school, recommendation of their high 
school principals, interview, high scores on the freshman test 
battery, and high scores on the Iowa High School Content Exam- 
ination. Since 1951 this project has been carried out under a sub- 
vention of the Ford Foundation Fund for the Advancement of 
Education. Mature students who have not had the opportunity to
complete their high school education may also be admitted by 
taking the same battery of tests as the accelerated freshmen and 
achieving a required standing. 

In addition to admissions, the test data serve
other purposes. For example, students who make scores equal to
the 65th percentile (national end-of-sophomore-year norms) in the
Cooperative test battery, designed annually for use at the sophomore
level, may apply for advanced standing credit for general education
courses. Students are placed in certain sections of freshman
English on the basis of their records in high school English and
scores on the Mechanics of English Test. Those who are exceptionally
low in this subject are required to take a noncredit course,
Fundamentals of English, before entering regular freshman English.
Similarly, students who make high scores on local chemistry
tests are placed in advanced courses. The same procedure is followed
in the modern language courses, and students may be exempt
from the language requirement for A.B. degrees by passing
language qualifying examinations. Students applying for advanced
standing in any field are required to take standardized tests, if
available, in addition to departmental examinations. Also, freshman
scholarships are awarded on the basis of high test scores, high
scholarship standing, and evaluation in interviews. For special
guidance purposes, students are given other tests, such as the
Kuder Preference Record or the Strong Vocational Interest Blank,
or the Wechsler-Bellevue Intelligence Scale.

Since most of the general education courses are concentrated in 
the first two years, students are required to take sophomore comprehensives
at the end of the second year for admission to the senior
college. These comprehensives are made up of the Cooperative
General Culture Test and the tests in Mechanics of Expression 
and Effectiveness of Expression. The graduation requirement in- 
cludes taking the Graduate Record Area Examination, which is 
comprised of area tests in the social sciences, humanities, and 
natural sciences, and an advanced test in the major field. 

A further illustration of the use of tests is provided by a pilot
study in the education of elementary teachers. During the past 
three years the college has been conducting this study under a 
grant from the Ford Foundation Fund for the Advancement of 
Education. Students are selected for the program on the basis of 
their general average in undergraduate colleges and their standing 
in a series of tests: Graduate Record Area Examination, American 
Council Psychological Examination, Kuder Preference Record, 
and the Minnesota Multiphasic Personality Inventory. 


NEEDS AND PROBLEMS 

This college owes a deep debt of gratitude to the various testing 
agencies that have contributed so many tests of national scope. As 
has been indicated, published tests are of particular value to 
medium-sized colleges that must carry on an effective educational 
program with limited funds. 

Twenty-five years’ experience in the use of tests at this college 
reveals a real need for the improvement of existing tests and the 
design of new ones, and it is felt that improvement in our testing 
program might be realized by availability of the following: 


1. An up-to-date comprehensive high school test, a measure of
high school graduates;
2. A more effective diagnostic reading test for high school graduates;
3. More comprehensive general education tests which should include:
(a) measurements of attitudes and critical thinking and
(b) more content of a contemporary nature and less emphasis
on the historical aspects of subject matter;
4. A test to appraise English composition on the freshman level;
5. A group test to reveal emotional disturbances of freshmen;
6. Language tests other than German and Romance languages,
such as Greek, Russian, and so on;
7. New (or revised) tests for standard college subjects.


It is also felt that the testing services rendered at the college
would be improved by:

1. A better understanding on the part of faculty members of the
nature of tests given and more effective utilization by faculty
members of test results in counseling and in evaluating student
growth; faculty members also need to become more expert in
test construction in order that they may supplement national
tests with those adapted to local objectives;
2. A better understanding on the part of students of the role that
testing plays in the furtherance of their own educational objectives;
3. Increased funds for research on the effectiveness of the testing
program of this college, as well as for a number of local and
special studies; and
4. An increased staff of well-trained personnel to administer testing
services.


12. The Testing Program of the Counseling 
Bureau of the University 
of Minnesota 


RALPH F. BERDIE, Director of the Student Counseling Bureau, 
Office of the Dean of Students 


SINCE THEIR INCEPTION THE TESTING PROGRAMS AT THE UNIVERSITY 
of Minnesota have served the broader counseling purposes of the 
school. Although few if any large institutions of higher learning 
can claim to have an all-pervasive educational philosophy, the 
student personnel point of view has been one of the dominant 
influences in the development of the educational program at Minnesota.¹
In recognition of the significance of psychological individual
differences, and as the result of the effective pioneer work of a
few farsighted university administrators and faculty members, the
need for adequate counseling has been acknowledged by many staff
members and by proportionately even more students.

From its earlier days the Minnesota counseling program has
been based upon research, and the needs and characteristics of students
have been carefully studied and analyzed.² Much of this research
has been based on psychological tests, and many of these
tests, originally used for research purposes, have been incorporated
into the counseling program of the university.

¹ E. G. Williamson, “Counseling and the Minnesota Point of View,” Educational
and Psychological Measurement, 7:141-55, 1947; E. G. Williamson, et al., The
Student Personnel Point of View (Washington: American Council on Education,
1949).

² E. G. Williamson and J. G. Darley, Student Personnel Work (New York: McGraw-Hill
Book Co., 1937); C. D. Williams, These We Teach (Minneapolis: University
of Minnesota Press, 1943).

The many testing programs within the university are reviewed
periodically to determine how the tests can contribute to the effective
counseling of students. Purposes other than counseling require
the extensive use of tests. For instance, admission to some colleges 
of the university is based in part upon test scores. Students also 
sometimes are classified into various courses on the basis of test
scores. Students who wish to graduate from some of the colleges of
the university with honors are required to attain specified scores
on certain tests. Many other programs involve the use of special
tests, and perhaps all the purposes for which tests are used in other 
colleges and universities are found also at Minnesota. In every 
instance, however, consideration is given to the possible use of tests 
for counseling, even when these tests are being administered for 
other reasons. 

As a result of this counseling emphasis on tests, one must understand
the counseling programs at Minnesota if one is to understand
the testing programs. A number of centralized university
offices provide counseling services to all students. In addition to
these centralized offices are other offices in which counselors are
concerned only with students from certain divisions of the university.
All counseling services and student personnel services are
coordinated by the Office of the Dean of Students. The dean of
students also is administratively responsible for some of the centralized
counseling offices, including the Student Counseling Bureau,
the Bureau of Loans and Scholarships, the Discipline Counseling
Office, the adviser of foreign students, the Veterans’ Counseling
Center, the Student Housing Bureau, the Student Activities
Bureau, and the Speech and Hearing Clinic. Other departments
which provide counseling to all university students include the
Mental Hygiene Clinic within the Student Health Service and the
Student Employment Bureau.

Although professional counselors in these central counseling
offices see thousands of students each year, perhaps the largest
number of students are advised and counseled, not in these central
programs, but rather in other counseling and advisory programs
in the university, particularly faculty advisory programs and
residential counseling programs. Central and professional counseling
services were developed in the university to supplement faculty
advisory programs rather than to substitute for these programs;
and much of the time and effort of professional counselors on the
campus is devoted to helping faculty advisers, administrators, and
others provide better services to students.

The Student Counseling Bureau is responsible for making avail- 
able to other personnel officers and to faculty members and admin- 
istrators information about students that will best help the univer- 
sity meet the varied needs of individuals. To obtain this informa- 
tion and to make it available to persons who can use it, the Student 
Counseling Bureau administers several testing programs. One of 
these programs extends down into the high schools and involves 
the administration of over one million tests to pupils in grades 
nine through twelve in Minnesota schools. Another series of pro- 
grams has as its purpose the providing of information to college 
admissions officials to assist them in making decisions regarding 
individual applicants. Another testing program is more closely 
related to the orientation program for the new students and in- 
volves the administration of ability, interest, and personality tests 
to entering students for the use of counselors working with these 
students. A clinical testing service also is maintained for the indi- 
vidual testing of students referred by counselors or other staff 
members. A number of other special testing programs are admin- 
istered by the bureau. 

In addition to providing information to others in the university 
who work with students, the Student Counseling Bureau also pro-
vides specialized professional counseling services to students; prac-
ticum training in counseling to graduate students;³ in-service train-
ing in counseling and testing to college, university, and high school
faculties; and consultation to high schools and colleges regarding 
problems related to testing and counseling. 

On the staff of the bureau, in addition to the administrative,
supervisory, and clerical workers, are counselors specializing in
marriage counseling, educational skills and remedial reading,
counseling physically handicapped students, vocational counseling
and occupational information, social psychology and group dynam-
ics, and clinical psychology and psychotherapy. Counselors with
these specialties do not work only in these restricted fields, but
rather, as they counsel all types of students with all types of prob-
lems, they also receive specific referrals of students with problems
in their areas, conduct related research, and assist other counselors
in broadening and deepening their counseling skills. Not all coun-
selors have specialties as listed above; some work more intensively
than others with programs such as residence counseling or orienta-
tion programs or with special groups of faculty members or stu-
dents. Supporting the work of the counselors are psychometrists,
statisticians, IBM operators, research assistants and clinical fellows,
and a secretarial staff. In 1958, the bureau employed sixteen pro-
fessional psychologists, four psychometrists, and thirty-one others.

³R. F. Berdie and Theda Hagenah, “A Training Program in Counseling,” Ameri-
can Psychologist, 5:140-42, 1950.

Perhaps the quickest way to comprehend the nature of the uni- 
versity’s testing and counseling programs is to follow the progress 
of a fictitious student through the University. Jerry Smith had 
his first contact with the university’s testing programs when he was 
in the ninth grade. His school at that time chose to participate in 
a university testing program and administered to all ninth-grade 
students the Differential Aptitude Tests and four Cooperative 
Achievement Tests in mathematics, social studies, natural science, 
and English. The test supplies were provided to the high school by 
the university. The tests were administered in the high school, 
scored by the university, and reported back to the high school. The 
scores then were recorded on the pupil’s cumulative records and 
used by Jerry's counselors and teachers. 

Help was provided to the high school staff by the university so
test scores could be used meaningfully. In each school were copies
of a manual for the state-wide testing program⁴ providing informa-
tion about the tests. Three times a year the Student Counseling
Bureau Bulletin and Occupational Newsletter was sent to high
schools and university departments to inform them of recent de-
velopments in testing and counseling. Research reports were sent
to the high schools periodically. Every two years the characteristics
of Minnesota high school and college students were analyzed and
these reports circulated.⁵


⁴R. F. Berdie, Wilbur L. Layton, and Theda Hagenah, … A Manual for the
State-Wide Testing Programs of Minnesota (… Counseling Bureau, University
of Minnesota, 1953).

⁵R. F. Berdie, W. L. Layton, and E. O. Swanson, “… Tests Used in the
Minnesota State-Wide Testing Program a… Aptitude in Minnesota Colleges”
(Mimeographed; Minneapolis: University of Minnesota, 1956).

High school counselors and principals and Sa ee
are invited to conferences on the university campus to discuss prob-
lems of counseling and testing. Occasionally, the bureau helps in
conducting conferences elsewhere in the state, and bureau mem-
bers participate in many professional meetings in Minnesota. Many
visits are made by bureau staff members to high schools to discuss
testing programs, individual students and their problems, coun-
seling records, and counseling methods. Much of this work has
been done in conjunction with the university’s College of Educa-
tion and the State Department of Education.

To return to our student: in the tenth grade the school, again
making use of the university’s testing program, administered to
Jerry and all other tenth-graders the Iowa Tests of Educational
Development and the Minnesota Counseling Inventory.⁶ Again,
these scores were recorded on Jerry’s cumulative record and re-
ferred to by several of his teachers and by his counselors. In the
eleventh grade all pupils were given the Minnesota Scholastic Apti-
tude Test⁷ and the Cooperative English Test as part of a testing
program administered by the Student Counseling Bureau of the
university and sponsored by the Association of Minnesota Col-
leges, of which the university is a member.⁸ After these tests were
scored, all the scores for Minnesota high school juniors were re-
ported to all member colleges of the association and Jerry’s scores
were reported back to his high school so that his counselors could
help him plan during his senior year for his post-high-school ac-
tivities. During the twelfth grade, the high school administered
to Jerry the Strong Vocational Interest Blank and readministered
the Iowa Tests of Educational Development and the Minnesota
Counseling Inventory. On the basis of changes in test scores, Jerry's
counselor was better able to help discuss plans for the future.

⁶R. F. Berdie and W. L. Layton, The Minnesota Counseling Inventory (New
York: Psychological Corporation, 1957).

⁷W. L. Layton, “Construction of a Short Form of the Ohio State University
Psychological Examination” (Mimeographed; Minneapolis: University of Minnesota,
1956).

⁸R. F. Berdie, “Guidance between School and College,” in College Admissions
(Princeton: College Entrance Examination Board, 1956).

Jerry then applied for admission to the College of Science, Lit-
erature, and the Arts at the university, and the Admissions Office
referred to his high school percentile rank and his score on the
college aptitude test before he was admitted. When he received 
his admission notification, he was encouraged by his high school 
counselor to talk with a counselor in the Student Counseling Bu- 
reau prior to the beginning of the university year; Jerry came to 
the bureau and there had three interviews with a counselor. At 
that time the counselor had Jerry take additional tests, including 
a reading test, in the bureau testing room. On the basis of this 
test, and other information, the counselor encouraged Jerry to 
spend some time developing his reading skills in the Educational 
Skills Clinic of the bureau. Before the beginning of class, Jerry 
also attended the two-day orientation and advanced registration 
program of the university. Here, he took another college aptitude 
test and the Minnesota Multiphasic Personality Inventory. Other 
students who had not taken the Strong Vocational Interest Blank 
in high school were given the test at that time. Students entering 
other colleges were administered other groups of tests, depending 
upon the colleges in which they were registering. 

All entering freshmen take the interest test. All students but
those in one college take the Minnesota Multiphasic Personality
Inventory; those in the other college take the Minnesota Counsel-
ing Inventory. Students in the College of Science, Literature, and
the Arts take an additional college aptitude test. In the Institute
of Technology they take an algebra test and, sometimes, the Lay-
ton Engineering Aptitude Test. General College students are
given the General Aptitude Test Battery of the U. S. Office of
Employment. Education students take the Minnesota Teacher
Attitude Inventory and the Cooperative Reading Comprehension
Test. Scores on all these tests are reported to college offices and are
available to faculty advisers.

During his first two years in the Arts College, Jerry completed
some work in the Educational Skills Clinic and two or three times
spoke with the counselor he had seen prior to the beginning of
school. Each quarter he also met with his college adviser, who
helped him plan his academic program and discussed other mat-
ters with him.

At the completion of his sophomore year, Jerry was considering
majoring in one of the departments that required all its students
to take the Cooperative General Culture Test; and, consequently,
Jerry took this test and his scores were reported to the college
and used there by the adviser who discussed with Jerry senior col-
lege alternatives.

Let us assume now that Jerry had decided he would study medi- 
cine. At the time he applied for admission to the Medical School, 
he was informed that he would have to take two groups of tests. 
The national tests required by most medical colleges and admin- 
istered by the Educational Testing Service are given by the Stu- 
dent Counseling Bureau, and another group of tests required by 
the University of Minnesota Medical School are also administered 
by the bureau. These test scores all were made available to the 
Medical School Admissions Committee and also to Jerry's coun-
selor, who was in a position to discuss with him their counseling
implications.

When Jerry was tested in the eleventh grade, a permanent uni- 
versity record card was initiated and his eleventh-grade test scores 
were recorded on this card. When he was given the interest test in 
the twelfth grade, this fact was noted on his card and a copy of his 
interest profile was placed in the files. From that time, every time 
Jerry took a test in the university, his scores were entered on the 
basic record card. The tests taken at the request of Student Coun-
seling Bureau counselors were included in his counseling folder
and kept in an adjacent file.

Jerry’s faculty adviser had made available to him by the college
office the eleventh-grade test scores and the scores of tests taken dur-
ing the orientation program. These scores are reported as a matter
of routine to the college offices by the Student Counseling Bureau.
At any time that Jerry’s adviser wanted additional test information
from the Student Counseling Bureau, this was available by calling
the Faculty-Student Contact Desk, the division of the bureau that
coordinates all available information concerning counseling con-
tacts and test scores. When Jerry's adviser called the Contact Desk
for additional information, … but also the names of any other coun-
selors who had seen Jerry. In this instance, he … counselor also had
talked with Jerry and, in turn, the bureau counselor would be
notified of the call to the desk from Jerry’s adviser.

Not every student goes through this rather intensive testing. For 
instance, many high schools do not use all of these tests. Many uni- 
versity students do not see a bureau counselor and consequently 
do not take those tests. Students entering certain colleges within 
the university are not required to take admissions tests and, con- 
sequently, for many students rather meager test information is 
available. For practically all university students, however, some 
information is available concerning general academic aptitude 
and, for an increasing number, information is available concern- 
ing measured vocational interests and measured personality char- 
acteristics. 

We should repeat that test scores are of little value unless people
are trained and motivated to use them properly. Consequently,
much of the professional counselors’ time is devoted to working
with individual university staff members to assist them in under-
standing the meaning of test scores. The bureau organizes and
conducts systematic programs that bring together groups of staff
members to present them with an opportunity to discuss testing
problems. For instance, during one year a series of several meetings
was held to which all faculty members and personnel workers were
invited to discuss vocational interest measurement. During another
year a similar series of meetings was devoted to the discussion of
personality tests and during still another year, the discussion cen-
tered around the measurement of academic aptitude. The Student
Counseling Bureau Newsletter is distributed to approximately five
hundred university staff members and reports current information
on the use of tests. Professional counselors also work closely with
resident counselors and much time is spent when resident coun-
selors consult with bureau counselors concerning individual cases
and counseling problems.

A testing program of this nature can be maintained at a high
level only if the research upon which it is based also thrives. New
tests are constantly being developed. During recent years a new
personality test, the Minnesota Counseling Inventory; a new aca-
demic aptitude test, the Minnesota Scholastic Aptitude Test; and
a new engineering aptitude test, the Layton Engineering Aptitude
Test, have been standardized. Prediction studies are conducted
periodically to provide to counselors and other staff members cur-
rent information concerning the predictive power of tests.⁹ An
analysis recently has been completed showing the changes or lack
of changes in the characteristics of college students during the past
twenty-five years. Another series of experiments has been com-
pleted exploring the use of tests and related information in group
counseling situations.¹⁰

In summary, the primary purpose of the testing programs at the
University of Minnesota is to provide to students, their instructors,
and their counselors information that will best help in the solution
of educational, vocational, and personal problems. University test-
ing also serves many other purposes, but only this primary purpose
provides justification for the extensive program here described.

⁹R. F. Berdie and N. A. Sutter, “Predicting Success of Engineering Students,”
Journal of Educational Psychology, 41:184-90, 1950; R. F. Berdie and W. L. Layton,
“Predicting Success in Law School,” Journal of Applied Psychology, 36:257-60,
1952; W. L. Layton, “Predicting Success of Students in Veterinary Medicine,”
Journal of Applied Psychology, 36:312-15, 1952; W. L. Layton and E. O. Swanson,
A Follow-up of Minnesota State-Wide Program Test Results in the Institute of
Technology (Mimeographed; Minneapolis: University of Minnesota, 1957).

¹⁰D. P. Hoyt, “An Evaluation of Group and Individual Programs in Vocational
Guidance,” Journal of Applied Psychology, 39:26-30, 1955.


13. The Testing Program of 
Pasadena City College 


W. B. LANGSDORF, President
FLORENCE BRUBAKER, Dean of Student Personnel


PASADENA CITY COLLEGE IS A TWO-YEAR PUBLIC JUNIOR COLLEGE
with an enrollment of 4,600 students in its day program. The stu-
dent body includes students who wish to parallel the first two years
of a four-year college or university program, students who wish to
take one or two years of specialized vocational training, and stu-
dents who wish to make up high school grade or subject deficien-
cies. Last year’s student body represented 849 high schools, 48
states, and 64 foreign countries.

Under California law public junior colleges must accept all high
school graduates and all others over the age of eighteen whom the
college believes may profit from the education offered. The stu-
dent body is, therefore, unselected and represents a wide range of
abilities and interests. At one end of the scale in scholastic ability
are the one hundred to one hundred and fifty students in each class
who have an A- or better high school grade average. At the other
end are a few students with IQ’s as low as 80 or 90. A major func-
tion of the public junior college is guidance and placement of stu-
dents in curricula for which they are qualified. Pasadena City
College, while not selective in admissions, is selective in admission
to specific curricula and courses, primarily on the basis of high
school achievement or achievement in junior college.

The basic testing program in such a college, therefore, is not a
selective admissions device, but rather an attempt to provide a uni-
form basis for evaluating the achievements and interests of its stu-
dents, as well as to obtain an objective appraisal of individual
scholastic aptitude and ability. It supplements scholastic achieve-
ment in class placement.

The present basic testing program is a minimum one, all fresh-
men taking the American Council on Education Psychological
Examination¹ in the fourth week of the semester in connection with
a required Basic Communication class. The Cooperative English
Test, Mechanics of Expression, is given to all students in freshman
English classes during the first week of the semester and is used,
together with instructor evaluations, to verify placement in regular
or remedial classes. All freshmen also are given a speech, a library,
and a listening test (developed by members of the faculty) in their
Basic Communication classes.

Specialized individual testing (on interest, aptitude, ability, pro-
jective personality) is available through the services of a psychol-
ogist assisted by a psychometrist. Such testing may be requested
by students, counselors, the health center, or members of the fac-
ulty and administration, all requests being channeled through the
dean of student personnel. Through individual counselor inter-
views, students plot profile charts of their scores on the ACE Psy-
chological Examination and Cooperative English Tests. Results of
specialized tests are released to appropriate staff personnel and
interpreted to students by the psychologist. Faculty members are
asked to consult the student’s counselor or the psychologist for
personal data regarding students.

There is a growing tendency to use achievement tests as place- 
ment tools in specific areas of science and mathematics. Such tests 
are administered by the psychometrist prior to registration. The 
ACE Psychological Examination and Cooperative English Tests 
probably should, and soon may, be given prior to registration. The 
Lado English Language Test for Foreign Students² has been found
to be most helpful in the counseling of foreign students. This also 
will be administered by the psychometrist prior to registration. 

The present year is a transition period for testing practices at
Pasadena City College. In addition to the psychologist and nine
full-time counselors who have long served the student body, a
full-time psychometrist has just been added to the staff. It is
planned that the basic testing program, as well as many of the apti-
tude and personality tests, will be administered by him, freeing
more of the time of the psychologist for test interpretation, psycho-
logical counseling, and casework. General supervision of the stu-
dent testing services is assigned to the dean of student personnel.
The psychometrist organizes and administers the group testing
program for entering students in cooperation with the dean of
student personnel, the psychologist, and the coordinator of Basic
Communication (an orientation program required of freshmen);
administers and scores placement tests in specialized areas as re-
quested; administers individual interest, ability, and aptitude tests
as requested by administrators, psychologist, counselors, and teach-
ers; and administers specialized tests of the General Aptitude Test
Battery of the U. S. Office of Employment or proficiency tests as
requested by the Placement Office.

¹With the withdrawal recently of this examination, Pasadena is now considering
several other tests as possible replacements.

²Published by George Wahr Publishing Co., 316 South State St., Ann Arbor, Mich.

The psychologist administers tests such as the Wechsler Adult 
Intelligence Scale, the Strong Vocational Interest Blank, Diagnos-
tic Reading Tests, Leiter Adult Intelligence Scale,³ and a variety
of aptitude and projective personality tests. 

Test data are filed in the student’s folder in the office of his coun- 
selor, or, in some instances, kept in the confidential file of the psy- 
chologist. Upon request, such data are interpreted to faculty by the 
student’s counselor. They are used by the counselor in advising stu-
dents with special regard to their educational and vocational plan-
ning. Some of the more general test data, such as the ACE Psycho-
logical Examination, the Cooperative English Test, and the
listening and speech tests, are studied by the student in his Basic
Communication class.

Among the problems facing junior colleges relative to measure- 
ment are the following: 

1. What constitutes a minimum, yet adequate, testing program? 

2. How can we detect, through a minimum group testing program, 
those students with reading or other handicaps whose potential 
(or true) ability is not shown on available group tests with high 
verbal loading? 

3. Do the wide discrepancies often found between the scores made
on individual tests, such as the Wechsler Adult Intelligence
Scale, and available group tests indicate that a new type of
group test is needed in situations where students present widely
variant personal and educational backgrounds?

4. How can we provide adequate vocational interest testing (such
as the Strong Vocational Interest Blank) at a feasible cost?

³… Homan Ave., Chicago 24, Ill.


14. The Testing Program of
San Francisco State College¹


ALAN W. JOHNSON, Associate Dean of Students 
F. GRANT MARSH, Coordinator of Testing 
JOSEPH AXELROD, Curriculum Evaluator 


SAN FRANCISCO STATE COLLEGE IS ONE OF TEN INSTITUTIONS MAKING 
up the California state college system. This system is supported by 
public funds and is, of course, subject to legal control of the State 
Board of Education and the laws of California. 

The primary educational functions of the state colleges are to
provide: (1) liberal education, with an emphasis on general edu-
cation in the lower division; (2) teacher education, providing pre-
service and in-service education of teachers for the public schools;
(3) occupational training through curricula requiring four or
five years of college training; and (4) preprofessional education
with training leading to graduate work in the major professions
and research fields.


San Francisco State College has been a leader in the develop-
ment of a specially designed general education program required
of all students. The major fields of study offered by the eight aca-
demic divisions of the college provide a wide variety of liberal
education and occupational training. The college is authorized to
award bachelor’s degrees in the arts, education, science, and voca-
tional education, the master of arts degree, and a large variety of
teaching credentials.

The college is located on a ninety-three-acre campus in San
Francisco and draws most of its student population from the San
Francisco Bay Area. In the fall 1957 semester there were enrolled
on campus approximately 6,100 full-time students and 3,700 part-
time students; in addition, 3,700 students are enrolled in off-cam-
pus extension courses. New regular students enter in almost
equally divided numbers as freshmen or as transfer students from
junior colleges or from other four-year institutions.

¹Alan W. Johnson and F. Grant Marsh, staff members of the Office of the Dean
of Students, are responsible for the section of this paper dealing with admissions
and credential tests; Joseph Axelrod, a staff member of the Office of the Dean of
Instruction, is responsible for the section dealing with the use of tests and other
evaluation instruments in the instructional program.


ADMISSIONS AND CREDENTIAL TESTING 


Admission standards at San Francisco State College are specified
in the California Administrative Code, which provides uniform 
admissions regulations for all California state colleges. These reg- 
ulations are, of course, complex, but a summary of them will be 
useful for those involved in admissions testing at other institutions. 


Policies concerning admission to the college 


A high school graduate is admitted if he has completed the 
equivalent of seven Carnegie units of course work in subjects other 
than physical education and military science, with grades of A or 
B on a five-point scale during the last three years in high school. 
If the high school graduate is able to present only the equivalent 
of five Carnegie units instead of seven, he may still gain admission 
if he attains at minimum the 20th percentile on the national col- 
lege freshman norms of a standard college aptitude test. 

Veterans and applicants over twenty-one years of age who are
not high school graduates but whose scores on the college entrance 
examinations indicate ability to do satisfactory college work may 
be granted admission. An applicant who has earned credit in other 
colleges and universities may be admitted if he meets standards 
as follows: (1) he must have a grade-point average of at least 2.0 
(grade C on a five-point scale) on the total program attempted, or 
(2) he may receive special consideration if he attains the 20th per- 
centile on the national norms of a standard college aptitude test. 
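
The admission rules summarized above amount to a small decision procedure. As a minimal sketch in Python, the rules for high school graduates and for transfer applicants might be written as follows. The names used here (`Applicant`, `freshman_admissible`, `transfer_status`) are hypothetical and do not come from the California Administrative Code. The individual judgment applied to veterans and to applicants over twenty-one is left out, since no numerical standard is given for it, and treating fewer than five units as falling outside the rule is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the summary above; all names are illustrative only.
FULL_UNITS = 7          # Carnegie units with grades of A or B
REDUCED_UNITS = 5       # accepted only when backed by an aptitude test score
MIN_PERCENTILE = 20     # national college freshman norms
MIN_TRANSFER_GPA = 2.0  # grade C


@dataclass
class Applicant:
    high_school_graduate: bool
    # Units with A or B grades in the last three years of high school,
    # excluding physical education and military science.
    qualifying_units: float
    aptitude_percentile: Optional[float] = None


def meets_percentile(percentile: Optional[float]) -> bool:
    """True when a test score exists and reaches the 20th percentile."""
    return percentile is not None and percentile >= MIN_PERCENTILE


def freshman_admissible(applicant: Applicant) -> bool:
    """Apply the seven-unit rule, then the five-unit-plus-test alternative."""
    if not applicant.high_school_graduate:
        return False  # non-graduates are judged individually, not by this rule
    if applicant.qualifying_units >= FULL_UNITS:
        return True
    # Assumption: fewer than five qualifying units falls outside the rule.
    return (applicant.qualifying_units >= REDUCED_UNITS
            and meets_percentile(applicant.aptitude_percentile))


def transfer_status(grade_point_average: float,
                    aptitude_percentile: Optional[float] = None) -> str:
    """Classify a transfer applicant under the two stated standards."""
    if grade_point_average >= MIN_TRANSFER_GPA:
        return "admissible"
    if meets_percentile(aptitude_percentile):
        return "special consideration"
    return "not admissible"
```

Under these assumptions, an applicant with five qualifying units and an aptitude score at the 35th percentile would be admissible, while the same applicant without a test score would not.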

A student may be admitted to graduate standing if he holds a
bachelor’s degree from an accredited institution. For admission to
candidacy for the master’s degree program, he must have a B aver-
age in all postbaccalaureate work, must have taken the Graduate
Record Area Examination, and must give evidence of a foundation
in his field sufficient to indicate probable success in the program.


Testing of applicants for admission to the college


All applicants for admission as regular students, except grad-
uates of four-year accredited colleges and universities and appli-
cants from foreign lands, are required to take entrance examina-
tions prior to final action on their applications. Applicants are
tested in large groups on specific dates which are scheduled
throughout the year.

The entrance battery includes: (1) the School and College Abil- 
ity Tests (the American Council Psychological Examination was 
used prior to 1956-57) and (2) the Cooperative English Tests, 
Mechanics of Expression and Reading Comprehension. These tests 
were selected because of their ease of administration to large 
groups and because they seemed to be the best available to furnish 
information about the applicants’ ability to pursue college work 
and to detect any deficiencies in their ability to handle the English 
language. 

Occasionally the Admissions Office may request the coordinator 
of testing to retest an applicant. There are various reasons for such 
requests, a common one arising when an applicant meets all the 
qualifications for admission but performs poorly on the entrance 
examinations. It may be that he has been away from school or col- 
lege for some time or that his Carnegie units were earned in 
courses other than in academic subjects (for example, music, art, 
shop, typing, and so on). Usually the Ohio State Psychological Test, 
the Wechsler-Bellevue Intelligence Scale, the Henmon-Nelson 
Tests of Mental Ability (college), and the ACE Psychological Ex-
amination are used for retesting. The selection of the test is deter-
mined by the nature of the case and the time available for admin-
istering it.

All freshman students are required to take a sequence of two
three-unit courses in Basic Communication to develop skill in
reading, writing, speaking, and listening. Undergraduate transfer
students who have not completed six units in English composition
are required to take three or six units in Basic Communication.
Remedial laboratories in reading and writing are available to help
students improve reading and study techniques and to improve the
effectiveness of their writing. Scores on the Cooperative English
Tests in Reading Comprehension and Mechanics of Expression
are used by the Testing Office in determining which students
should enroll in these laboratories, in addition to regular enroll- 
ment in the required courses in Basic Communication. Tenta- 
tively a cutting scaled score of 55 on these tests has been estab- 
lished. These tests have been in use only two years, and no studies 
have been completed on their effectiveness, although two are now 
in process. In addition to the reading and writing tests, a short test
in speech performance is given to new students prior to registra-
tion for the purpose of discovering which among them should be
referred to a remedial speech laboratory.

The results of all tests in the entrance battery are available for
the use of administrative staff and faculty.

Prior to 1956-57, the ACE Psychological Examination was in-
cluded in the entrance battery. No studies were made of its effec-
tiveness in predicting success. The School and College Ability
Tests (SCAT) have been in use for one year. With the installation
of IBM equipment this year, it is hoped that a study of the effec-
tiveness of SCAT can be made.


Admission to teacher education programs and candidacy 
for teaching credentials 


Candidates for admission to teacher education curricula are
judged on the following bases: intelligence, scholarship, satisfactory
completion of at least two years of college level work, professional
aptitude, physical fitness, speech and language usage, personality
and character, and diversity of interests. Concerning the
first factor in the foregoing list, California State law reads: “Any
candidate who falls below the 25th percentile on the national college
norms of a generally recognized college aptitude test must
demonstrate compensating strength in other qualities.” Results on
SCAT or on the ACE Psychological Examination (prior to fall
1956), taken at the time of entrance, are furnished to the Education
Division for use in weighing candidates on the intelligence factor.
Under the law, the factors of professional aptitude and personality
and character are to be evaluated by tests, observations, and interviews,
and determination of the results is to be made by committee
action. Use of tests to assess professional aptitude, personality,
and character of credential candidates has been very limited.

No person may be employed as teacher or administrator in a
California public school unless he holds a credential issued by the 
State Board of Education. Requirements for the credential are 
established by the board, and provide, among other things, that 
credential candidates shall demonstrate proficiency not only in oral 
and written language but in several other areas as well. San Fran- 
cisco State College has prescribed a battery of competency tests 
which students must pass; if they fail to do so, they must success- 
fully complete a course which covers the subject matter and skills 
measured by the test. Students are advised to take the competency 
tests at the close of their sophomore year or as soon thereafter as 
possible, so that deficiencies may be corrected before they enter 
their professional programs. Students may not enroll in student 
teaching courses until all the required competencies have been 
demonstrated and candidacy for the credential has been granted. 


The competency tests administered to elementary credential 
candidates are outlined below: 


1. Locally constructed tests covering the fields of art, geography, music, 
nature study, physical education, speech, and written English skills. 
2. Published tests: 

a) Until the fall of 1957 the Iowa Every-Pupil Test of Basic Arith- 
metic Skills, advanced battery, Forms L, M, and N were em- 
ployed. National norms were used, and the ninth-grade equiv- 
alent was required. 

Beginning in the fall of 1957 the Q score on SCAT has been 
substituted for the Iowa Every-Pupil Test in arithmetic. 

b) Cooperative English Test, Reading Comprehension, Forms T 
and Z; local norms are used for the different college years; can- 
didates must attain the 40th percentile for passing. 


The competency tests administered to secondary credential can- 
didates are as follows: 


1. Locally constructed tests covering the fields of speech and written
English skills.
2. Published tests:

a) Cooperative English Test, Reading Comprehension, Forms T
and Z; local norms are used for the different college years; candidates
must attain the 40th percentile for passing.

b) Cooperative Test of General Culture; three of the five parts must
be completed at a satisfactory level.

Admission to other curricula

Systematic use of admissions test results in determining admission
of candidates to curricula other than teacher education is
carried on in only one major field of work. This is the clinical 
science curriculum for the training of laboratory technicians. Ap- 
plicants for this course of study must rank above the 40th percen- 
tile on national norms on the ACE Psychological Examination or 
on SCAT. The results on this and other tests are, however, used 
extensively for educational and vocational counseling and by ma- 
jor advisers working with their advisees. 


THE USE OF TESTS IN THE INSTRUCTIONAL PROGRAM 


In addition to the wide use of tests at San Francisco State College
in its entrance battery for freshmen and transfer students seeking
admission, in the competency testing of candidates for the
credential program, and in the admission of candidates to the
graduate curricula, the college has, of course, found testing
indispensable for its instructional program. The use of tests has been
important not only in serving as a main means of evaluating student
progress and achievement for the purpose of assigning course
grades,² but also, especially during the last half-dozen years, in the
systematic attempts by our instructors and course staffs to judge
the effectiveness of their instruction.


The role of a central office in self-appraisal 
projects by instructional staff 

An Office of Curriculum Evaluation was set up in 1951. Its
primary purpose was to encourage and help initiate self-appraisal
programs in every curricular area ready for such scrutiny, and to
aid individual instructors, in whatever ways it could, in their efforts
to improve the effectiveness of their own courses. The office
is staffed by a curriculum evaluator and his secretary. Additional
aid is obtainable when needed, both in the way of faculty advice
and help and of clerical assistance. During the first several years
of the existence of the office, an all-college Advisory Committee
on Evaluation met monthly with the evaluator, but once the office
achieved stability in its functioning and once its services to faculty
members became well known, such a faculty committee no longer
proved necessary.

2 In a survey carried on about three years ago among our faculty who teach general
education courses (numbering over one hundred), it was discovered that for 36
percent of the faculty, test scores had an 85 percent, or greater, weight in determining
the course grade and that, for well over half the faculty, test scores had a 55
percent, or greater, weight in determining the course grade. Only 11 percent of the
faculty indicated the weight given to test scores in determining the course grade
was less than 15 percent.

It is important to realize that, at San Francisco State College, the
evaluator does not function as an administrative officer but that he 
and his office are wholly designed to be of service to the several 
divisions of the college and their individual faculty members. He 
plays no part in such matters as staffing and scheduling of courses 
and promotions. He is a member of the staff of the Office of the 
Dean of Instruction, but, while he may help a staff or instructor 
prepare or select appropriate tests for a project in self-appraisal or 
may help in the design of the project and in the collection and 
interpretation of data, he does not submit to the dean of instruc- 
tion the findings of projects he helps individual instructors carry 
out. This type of relationship to the faculty members whom he 
serves and to the dean under whose general supervision he works is 
imperative if instructors are to continue to make use of the evalu- 
ator’s services for the purpose of discovering their weaknesses and 
strengths as teachers. 

In other aspects of his work, the evaluator may play a more official
role. For example, suppose one of the teaching staffs brings a
proposal for a change in some aspect of the instructional program
to the committee for consideration, and the committee decides it
needs more data. It may ask the curriculum evaluator to procure
the data and report directly to the committee, or it may ask the
course staff to consult with the evaluator on the best way of
proceeding to collect the data and later to report back to the committee
when the data are compiled.


From the preceding description, some of the philosophic principles
can be inferred upon which the work of the Office of Curriculum
Evaluation is built. Its ultimate goal is neither the construction
or use of tests, nor the compilation of data, nor even the
analysis of strengths and weaknesses in a present or proposed
course or curriculum; these are all means toward its ultimate goal,
which is the improvement of instruction.³


Faculty self-appraisal projects involving 
tests of student achievement 

Many of the evaluation projects have, of course, begun with the
collection of data on student achievement. We must admit that
discovering or constructing achievement tests adequate for our
purposes has remained a major problem. Where the
skill or the body of knowledge over which mastery is being tested
is, in our view, adequately measured in a published test, it is
obviously a great advantage to us to purchase and use it. If we were
to attempt to construct and refine our own instrument, we would
probably have a less reliable and, in the end, much more expensive
one. But with the exception of the tests mentioned in succeeding
paragraphs, published tests have not generally suited our purposes,
especially where we have needed them most, namely, in
the four broad areas of our general education goals.

Our staffs are, however, most interested in the publication of 
folios of large numbers of test items (rather than individual tests) 
from among which they may select the exercises which, in their 
judgment, test for the goals held significant by the staff itself. In 
a subsequent section dealing with our television experiment in 
general education courses, mention will be made of the use to 
which such a folio in the field of the natural sciences (published 
by the Educational Testing Service) has already been put. 

Some of the instruments which have been used to collect data on 
student achievement for analysis in various evaluation projects are 
as follows: The Health Inventories (Cooperative Test Division of 
the Educational Testing Service); the English Structure Test (de- 
veloped at the University of Michigan and used in projects with 
our overseas students); the Iowa Silent Reading Test; the Univer- 
sity of California Subject A Examination; the Diagnostic Reading 
Tests (Educational Records Bureau). A project appraising
improvement in logical reasoning used a locally developed instrument
adapted from several of the latest forms of the Progressive
Education Association instruments used in the Eight-Year Study.⁴
A project attempting to measure growth in critical thinking used
a locally constructed essay test, in which a special scoring procedure
was worked out involving anonymity of both student and
year (that is, whether pre- or post-test) and two independent readers
for each essay booklet. In one project measuring the effectiveness
of the general education two-semester required course in
psychology, the entire department participated in the construction
of an instrument for pre- and post-test purposes.⁵ In addition to
the foregoing, student achievement tests used in the television
experiment will be listed in a later section.

3 For a full statement of the philosophy of evaluation too briefly summarized
here, see Joseph Axelrod, “Evaluation versus Mumblety-Peg: How To Appraise a
Program in Curriculum Evaluation,” Educational Record, 35:305-12, October 1954.


Evaluation projects involving instruments other 
than achievement tests 


The evaluation projects have not always, of course, called for the 
use of achievement tests. A number of the projects have sought to 
collect and analyze faculty opinion or student opinion on certain 
issues or courses. Aside from the questionnaire type of instrument, 
a number of projects have used sociometric tests or rating scales on 
which faculty judgments are recorded. Two projects, proceeding on 
the assumption that effectiveness of instruction in a multiple- 
section course was closely allied with the functioning of the course 
staff as a group working cooperatively, studied course staffs by 
means of such tests and scales. All of these instruments were locally
constructed, although a number were actually local adaptations of
instruments developed elsewhere.⁶


4 These were adapted to our needs by the Office of Curriculum Evaluation with
the help of Dr. Hilda Taba, who was at that time a member of the college-wide
Advisory Committee on Evaluation.

5 All these projects are described in detail in mimeographed “Evaluation Reports,”
which were issued to the faculty. Copies are available on loan to interested readers.
Summaries of all these are given in another mimeographed document, “A Report
on Evaluation in General Education at San Francisco State College: Summaries of
Twenty-Seven Evaluation Reports Submitted to the Committee on General Education,”
also available on loan from the college. Some of the projects and some
of the instruments used in them (with illustrations) are described by Joseph Axelrod,
“The Evaluation of the General Education Program at San Francisco State
College,” in Paul L. Dressel (ed.), Evaluation in General Education Programs
(Dubuque, Ia.: Wm. C. Brown Co., 1954).

6 These are partly reproduced in Paul L. Dressel and Lewis B. Mayhew, General
Education: Explorations in Evaluation (Washington: American Council on Education,
1954).

A number of the projects used more than single tests. One ambitious
study, for example, evaluating the effectiveness of our general
education course in Home and Family Living, used the Bell
Adjustment Inventory (student form); the Mooney Problem Check
List (college form); a locally constructed subject-matter test; a
locally constructed projective test (incomplete sentence type); a
student rating scale consisting of questions about the course and
the instructor; recorded counseling interviews; and a course
evaluation form.⁷


Tests used in the television experiment in 
general education courses 

San Francisco State College is now carrying on experimentation, 
under a grant from the Fund for the Advancement of Education, 
to compare instruction by television with instruction through 
traditional media. In the first phase of the project (1956-57), the
following instruments were used in addition to tests of subject
matter: the Edwards Personal Preference Schedule; Personal Inventory
(Self-Insight Scale), which is partly standardized;⁸ Bills’
Index of Adjustment of Values, which is standardized but not
available in published form;⁹ the California Auding Test, Form F,
revised, 1952, by Brown and Caffrey (published by the Council
on Auding Research, Redwood City, California); and a locally
devised sociometric test. Additional data on the students included
test scores on the ACE Psychological Examination and SCAT as
well as information not obtained through tests.

In the second phase of the television experiment, currently 
under way, similar instruments are being employed. One of the 
achievement tests, the Hills Economics Test, is standardized, but
the others are locally constructed. One of these, covering the field
of natural sciences, consists of items selected from a published test
folio. In addition, the project is using: the Watson-Glaser Critical
Thinking Appraisal; Individual Inventory (Self-Insight), developed
and partly standardized by L. Grose, Syracuse University;
the Edwards Personal Preference Schedule; Attitude Scales, developed
at Miami University, Oxford, Ohio; a Degree of Interest
Scale, developed at the Pennsylvania State University; and sociometric
tests locally devised.

7 The research for this evaluation project was done as part of a doctoral dissertation
by Duncan V. Gillies, “Three Methods of Teaching a College Course in Home
and Family Living in a General Education Program” (Unpublished, Stanford University,
1952).

Two other doctoral dissertations were, in fact, evaluation studies of aspects of the
general education program: Florence C. Haimes, “Physical Sciences in the General
Education Program of a State College,” and Arthur J. Hall, “An Evaluation of a
College Course in Occupational Development” (both Stanford University, 1952 and
1949 respectively).

8 Personal Inventory (Self-Insight Scale) appeared in the Journal of Social Psychology,
219-36, November 1948.

9 Bills’ Index of Adjustment of Values appeared in the Journal of Consulting
Psychology, 257-61, 1951.


Tests used by instructors for assigning course grades 


About three years ago, the Curriculum Evaluation Office carried
on a survey of faculty opinion about the tests they were using
in their courses. A questionnaire was used, which was answered 
anonymously. The survey was carried on among the faculty teach- 
ing one or more sections of a general education course; this in- 
volved more than a third of the total faculty. 

When the faculty members were asked whether they felt their 
examinations were successfully measuring the degree to which stu- 
dents were attaining the basic purposes of their courses, only 37 
percent answered “Yes” and an additional 19 percent answered 
“Yes, with reservations.” It is interesting that this percentage was 
smaller than that given in a similar survey made a year and a half 
before, and that in the interim attempts had been made to achieve
greater awareness among faculty members of the problems in test
construction for general education courses.

10 Paul L. Dressel and Clarence H. N[...] Test Item Folio No. 1 (Princeton, N.J.:
Educational Testing Service, 1957).

11 [...]se of the project appeared in 1958: Robert E. [...], An Experimental Study
of College Instruction Using Broadcast Television, Project N[...] project director
and evaluator for 1956-57; [...]ilson, who are, respectively, project director and
evaluator currently.

Another question asked: “If an informal faculty seminar were
organized (with membership completely voluntary) to study ways
of improving testing procedures and appraisal of student achievement
in your area, do you think it would be worth your while to
attend some or all of its sessions?” Over two-thirds of the faculty 
answered “Yes.” Unfortunately, it was impossible to carry out 
plans for a full-fledged seminar, but the curriculum evaluator met 
with several of the course staffs to discuss the construction of tests 
in their area, and meetings between faculty members and the 
evaluator, individually or in very small groups, to discuss such 
problems, are part of the regular routine of the Evaluation Office. 
The evaluator frequently instructs faculty members in the me- 
chanics of converting a hand-scored objective test to a machine- 
scored one and in the rewording of items. It is inevitable and 
desirable that more than merely mechanical features be treated. 
Several of the questions in the faculty survey sought to discover 
from instructors whether students could average C or better on 
their examinations through memorization alone. Well over a third 
of the faculty responded that this appeared to them to be true. 
Two additional pieces of data gathered by the questionnaire 
concerned a problem that is frequently encountered in the con- 
struction of teacher-made objective tests and in the grading of 
essay tests. Instructor-made tests of the objective type often use 
ambiguous wording so that the better students are often penalized. 
The evaluator, in working on this problem with faculty members,
discovered, as he expected to, that this feature was substantially
improved after attempts were made to persuade instructors to add 
an indispensable step in the process of test construction: to ask one 
or two colleagues to run through the items in their draft form. 
It was gratifying therefore to discover, in the faculty survey, that 
almost half of the instructors who used tests of the objective type 
(40 percent indicated that they used the essay type only) said they 
did generally carry on this practice before the typing of their final 
test copy. 


On the essay type of test, the equivalent problem is that of
reliability in grading. And the equivalent solution is for instructors
to ask a colleague to grade some or all of the papers independently
(in accordance with the original instructor’s criteria for
judging the essays) and to analyze the extent and bases of disagreements
in the grades assigned. Here, too, attempts were made, always
on the informal level, to demonstrate that this was a worthwhile
practice. In the survey, about one-third of the faculty members
who used essay tests (26 percent indicated they used the objective
type of test only) said they had tried using this method. A
number who said they had not tried it indicated, however, in their
written comments that they used some other method of ensuring
minimum reliability in the grading of essay tests.¹²
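
The comparison of two independent gradings can be illustrated with a short sketch. It is a hypothetical example, not the procedure used at the college: the function name, the letter-grade scale, and the choice of exact agreement and mean grade-point difference as summary figures are all assumptions.

```python
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}  # assumed scale


def grading_agreement(first_reader, second_reader):
    """Compare two readers' letter grades for the same essays.

    Returns the share of essays given identical grades, the mean
    absolute difference in grade points, and the positions of the
    essays on which the two readers disagreed.
    """
    if len(first_reader) != len(second_reader) or not first_reader:
        raise ValueError("both readers must grade the same non-empty set")
    # Grade-point gap for each essay, in the order the essays were listed.
    diffs = [abs(GRADE_POINTS[a] - GRADE_POINTS[b])
             for a, b in zip(first_reader, second_reader)]
    disagreements = [i for i, d in enumerate(diffs) if d > 0]
    exact = 1 - len(disagreements) / len(diffs)
    mean_gap = sum(diffs) / len(diffs)
    return exact, mean_gap, disagreements
```

The essays listed as disagreements are the ones the two readers would then discuss in order to find the bases of their differences.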

Some institutions have attempted to solve a number of the 
problems discussed here by setting up an examiner’s office, staffed
by testing experts who are given the responsibility of constructing 
adequate examinations and administering them to students. Such 
an office is particularly appropriate for programs in which many 
of the courses are multiple-section courses, as they are in the gen- 
eral education program at San Francisco State College. One of the 
evaluation reports, in the process of analyzing the problem of 
achieving common grading standards in a multiple-section course 
with a fairly large staff, presents a review of the arguments for and 
against the centralized examination system as a solution to problems
in testing and grading.¹³ The college, however, has never seriously
considered the possibility of setting up a central examining
agency as the primary means of certifying that students have fulfilled
the college requirements in general education. It is nevertheless
true that from time to time some of the course staffs have
considered the possibility of constructing and using common tests
in their multiple-section course.


We do not have recent data on this matter, but the earlier of the
two faculty surveys referred to above gives us faculty opinion and
practice on this question in 1952. At that time, 60 percent of the
faculty teaching in the general education program said that they
had used test exercises in their exams which were prepared for
common use by staff members, even though this was not a regular
practice. An attempt, too, was made to discover how many of the
faculty believed that a certain amount of common testing materials
ought to be used by every instructor teaching different sections of
their course. Almost 80 percent of the faculty said that they shared
this belief. The faculty was further asked its opinion on the specific
amount that should be common to all instructors in the
course. Almost 40 percent said they could not answer the question;
but of those who had an opinion, three-fourths said they believed
half or more of the examinations in their general education
courses should consist of common test exercises.

12 The mimeographed report of this survey, Evaluation Report No. 6: “Faculty
Opinion on Testing Problems in the General Education Courses,” is available on
loan from the college.

13 Evaluation Report No. 7: “The Problem of Course Grades and Grading Standards
in General Education Courses” (mimeographed); available on loan from the
college.

Subsequently, a number of the general education course staffs 
did work out common test exercises, but their use has been com- 
pletely voluntary, and the degree to which the score on any test 
determines the student’s course grade remains entirely a decision
of the individual instructor.

One of the objects in setting up a separate examiner’s office is
to relieve instructors of the task of constructing their own tests and
giving their own grades; and one of the arguments in favor of this
policy is that testing and appraising student achievement is a task
which, to be effectively performed, requires an amount of time
and the possession of skills which, on the whole, college instructors
are unwilling to give and do not possess. The policy at San Francisco
State is, however, not to relieve the instructor of these tasks,
but rather to take steps which will enable him to perform the
appraisal tasks more adequately than most college instructors are
apparently able to do at the present time.

In pursuing this policy, not only the Office of Curriculum
Evaluation and the Testing Office but also a large number of the
administrative officers of the college have given encouragement
and service.

14 Some of our staff members, it must be said, have become especially interested in
testing problems. Outside the field of psychology, whose staff members have been
specially trained in the techniques, two efforts are worth noting here. One staff
member has been interested in developing instruments that are machine-scored
and also open-book; another has developed exercises for his own courses using the
Objective Test of Essay Answers technique expounded by Leo Nedelsky,
“Evaluation of Essays by Objective Tests,” Journal of General Education, 209-2[...]